Regional Analysis of the Dependence of Peak-Flow Quantiles on Climate with Application to Adjustment to Climate Trends

Over, Thomas; Marti, Mackenzie; Podzorski, Hannah

doi:10.3390/hydrology12050119

by Thomas Over ¹,*, Mackenzie Marti ¹ and Hannah Podzorski ²

¹ Central Midwest Water Science Center, U.S. Geological Survey, Urbana, IL 61801, USA

² Central Midwest Water Science Center, U.S. Geological Survey, Iowa City, IA 52240, USA

* Author to whom correspondence should be addressed.

Hydrology 2025, 12(5), 119; https://doi.org/10.3390/hydrology12050119

Submission received: 21 March 2025 / Revised: 2 May 2025 / Accepted: 9 May 2025 / Published: 14 May 2025

(This article belongs to the Special Issue Runoff Modelling under Climate Change)

Abstract

Standard flood-frequency analysis methods rely on an assumption of stationarity, but because of growing understanding of climatic persistence and concern regarding the effects of climate change, the need for methods to detect and model nonstationary flood frequency has become widely recognized. In this study, a regional statistical method for estimating the effects of climate variations on annual maximum (peak) flows that allows for the effect to vary by quantile is presented and applied. The method uses a panel–quantile regression framework based on a location-scale model with two fixed effects per basin. The model was fitted to 330 selected gauged basins in the midwestern United States, filtered to remove basins affected by reservoir regulation and urbanization. Precipitation and discharge simulated using a water-balance model at daily and annual time scales were tested as climate variables. Annual maximum daily discharge was found to be the best predictor of peak flows, and the quantile regression coefficients were found to depend monotonically on annual exceedance probability. Application of the models to gauged basins is demonstrated by estimating the peak-flow distributions at the end of the study period (2018) and, using the panel model, to the study basins as-if-ungauged by using leave-one-out cross validation, estimating the fixed effects using static basin characteristics, and parameterizing the water-balance model discharge using median parameters. The errors of the quantiles predicted as-if-ungauged approximately doubled compared to the errors of the fitted panel model.

Keywords:

flood frequency; nonstationarity; climate variation; regression; water balance; midwestern United States

1. Introduction

Peak-flow quantiles are a key dataset for floodplain management and the design of critical infrastructure, including drainage systems and highway bridges. Possible nonstationarity (i.e., changes in the joint distribution of a sequence of random variables, thus including changes in the marginal distribution at each time step, as well as in the inter-step dependence) in peak flows and other hydrologic time series has received considerable attention and sparked substantial debate in recent years [1,2,3,4,5,6,7,8,9,10,11,12]. Although observational evidence for widespread climate-change-based trends in peak flows is not strong [13,14,15,16,17], apparent nonstationarity on time scales of decades (which are thus relevant to design decisions) may arise from the internal variability of the climate system [18,19], and certain kinds of floods are deemed likely to increase, i.e., those more directly related to changes in precipitation extremes (those in small and/or urban basins) and perhaps the very largest floods [17,20]. In addition, trends in floods resulting from watershed changes such as urbanization are well documented [1,21,22,23,24,25,26,27,28,29]. Despite their importance for water management, methods for characterizing nonstationarity in peak flows and incorporating it into design have not been standardized [6,30]. For example, in the United States, Bulletin 17C [31] presents a sophisticated approach to stationary flood-frequency analysis, and while it recognizes the importance of possible nonstationarity, it does not recommend a methodology for addressing it.

A range of methods has been proposed to account for the effect of climate variability and change on design floods [11,30,32,33,34,35,36,37]. Schlef et al. [38] classify what they consider to be the most common as follows: (1) extrapolation in time of observed trends in parameters of a selected flood-frequency distribution; (2) a chain-of-models approach, beginning with bias-corrected and downscaled projected climate data from General Circulation Model (GCM) simulations, which are then used to force hydrologic models of varying complexity (thus, an off-line analysis: there is no feedback from the hydrologic model to the GCM); and (3) a climate-informed approach: flood-frequency distribution parameters are fitted by statistical methods to time-dependent covariates describing atmospheric circulation patterns that have been shown to be more skillfully predicted by GCMs than local temperature and precipitation. All these methods use projections of future climate conditions; a more conservative approach, which avoids the uncertainty associated with projections of the future, is to estimate the flood frequency after updating to present climate and land-use conditions [25,26,27,39,40,41]. This approach has been most commonly applied in the case of trends induced by basin changes, especially urbanization; evidence for the validity of updating to the present in the presence of basin changes can be seen in the results of Luke et al. [42].

A critical aspect of the applicability of methods for the effect of climate change on design floods is whether the location of interest is gauged. If it is, single-station statistical methods can be applied directly using at-site peak-flow statistics [8]. If it is not, for a statistical method, the flood-frequency parameters need to be estimated, which may be performed by using a regionalization approach applied to selected basins that are believed to be hydrologically similar. For the chain of models approach, floods can be simulated at any location, but their accuracy is unknown except at gauged locations, and the use of regionalization to check and correct the values at ungauged locations again would be warranted. A common regionalization approach for estimating peak-flow quantiles under stationary conditions, especially in the United States, is the development of spatial regression equations that relate peak-flow quantiles and static basin characteristics. Such equations were introduced by Benson [43] as a generalization of the index-flood method [44,45]. Since their introduction, techniques for estimating the parameters of these equations, particularly when the peak-flow quantiles have been fit using a log-Pearson type III distribution as is recommended in the United States [31], have continued to be refined [46,47], and equations have been developed for most states in the United States, although not all have climatic predictors. Recently, it has been suggested [33,34] that when such spatial regression includes climate characteristics, e.g., annual mean precipitation, it could be used to predict the effects of climate change through the climate characteristic. Doing so would be to trade space for time, which is commonly performed [48,49], but may not be accurate [50,51]; one possible issue is omitted variable bias [52] (ch.3). In addition, the predictors were selected for spatial prediction and thus may not be the best choices for temporal prediction. 
Recently, however, panel regression, which uses pooling in space to estimate changes in time, thus avoiding the need to trade space for time, has been increasingly applied to estimate the effects of temporally varying covariates on streamflow statistics [21,25,26,53,54,55,56]. Most of these applications, although not all, have focused on the effects of land-use change, especially urbanization.

The applicability of panel regressions to ungauged basins depends on how the inter-basin heterogeneity is modeled. In panel regressions, this heterogeneity is modeled using intercept terms that are either estimated for each site from the time-varying data used to estimate the model coefficients (“fixed effects”) or they are assumed to be random variables independent of the time-varying predictors for which only the variance is estimated (“random effects”) [52] (ch.14), [54,56], and [57] (ch.14). Because, in a random-effects model, fewer parameters are estimated than with a fixed-effects model, it is more efficient than a fixed-effects model when the independence assumption is valid, and furthermore, static characteristics also may be incorporated into a random-effects model. These features make random-effects models directly applicable to ungauged basins; to apply a fixed-effects model to an ungauged basin, the fixed effect must be estimated by some means external to the regression model.

A standard panel regression analysis predicts the conditional mean response to the predictor; in flood-frequency analysis, interest is in the quantiles, and the response may vary with the quantile. In spatial regression, the flexibility to allow variation in the response by quantile is one advantage of the one-quantile-at-a-time regional regression methodology introduced by Benson [43] over the index-flood method [45]. One way to allow variation with quantile in the panel regression context is to analyze one quantile at a time, similar to what is performed in regional spatial regression, as Bassiouni et al. [54] did with low-flow quantiles. For peak-flow quantiles, predictions for quantiles with small exceedance probabilities that require extrapolation via fitting to a selected probability distribution are typically required for use in applications. To avoid such extrapolation, an alternative approach is to combine quantile regression [58], which predicts conditional quantiles rather than the conditional mean that is predicted in least-squares regression, with panel regression.

Even apart from a combination with panel regression, quantile regression has been applied somewhat rarely in flood-frequency analyses. One reason may be that it is nonparametric with regard to the probability distribution, at least in its basic form, which conflicts with typical flood-frequency methodologies (but refer to Frumento and Bottai [59] for a parametric version). There appears to be one previous application of quantile regression in a regional (spatial) analysis of flood quantiles [60] and one application to the dependence of flood quantiles on climate indices [61]. In their case study application in Quebec, Ouali et al. [60] showed that their quantile-regression-based method outperformed the traditional least-squares-regression-based approach when compared to the raw data using a loss function developed by the authors. Sankarasubramanian and Lall [61] compared quantile regression and a local likelihood method for seasonal flood forecasting and reconstruction of past flood series. Quantile regression has also been used in single-station analyses of peak flows [62,63,64,65], precipitation [66], and streamgage height [67] to estimate trends for different quantiles. The use of quantile regression in trend analyses is beneficial because quantile-dependent trends give more information than standard methods such as least-squares regression on time or the Mann–Kendall test, which give trends in the mean and median, respectively, although moment-based approaches to testing for changes in quantiles have been proposed [68]. Hecht et al. [40] also recently tested the use of quantile regression in updating flood quantiles to reflect nonstationarity.

When applying quantile regression to panel datasets, i.e., performing panel–quantile regression, an additional issue arises: whether the intercept terms that model inter-site heterogeneity, regardless of whether they are fixed or random, vary with the quantile. The first panel–quantile regression approach, that of Koenker [69], assumed the fixed effect was the same for all quantiles, which is also termed a pure location shift, and the model was solved by applying a shrinkage method to the fixed effects in which the sum of their absolute values scaled by a free parameter is included in the quantity to be minimized, while estimating all the quantiles at once. This approach has the downsides of being computationally demanding as well as introducing a free parameter that needs to be estimated. After Koenker [69], several authors presented other approaches to the panel–quantile regression problem assuming fixed or random effects (see, e.g., the review by Galvao and Kato [70]). Among the fixed-effects approaches was that of Canay [71], who presented a so-called “simple” approach to quantile regression for panel data, which, like the shrinkage approach of Koenker [69], assumes the fixed effects are pure location shifts, but it is computationally much simpler and does not include a free parameter. More recently, Machado and Santos Silva [72] introduced a moment-based (i.e., one solved using least-squares) method, termed method-of-moments quantile regression (MMQR), for estimating conditional quantiles with panel data based on a location-scale model with distributional fixed effects that allow relaxation of the assumption of a pure location shift. To our knowledge, the only prior uses of panel–quantile regression analysis for modeling peak-flow quantiles are the analysis of the dependence of peak-flow quantiles on urbanization and climate by Over et al. [25,26], who used the approach of Canay [71], and that of Over et al. [27], who compared single-station quantile regression with the panel–quantile approaches of Canay [71] and Machado and Santos Silva [72] and determined, for their study dataset, that MMQR was overall more accurate than the approach of Canay [71] and, in particular, that the distributional fixed effects of the MMQR method were statistically significant.

This study has three goals. The first is to evaluate the use of panel–quantile regression, specifically the MMQR method, for modeling regional flood frequency with nonstationarity driven by climate, by comparing the panel–quantile results to those from single-station least-squares and quantile regression. An MMQR model includes basin-specific fixed effects for the gauged basins to which it has been applied, limiting its direct application to ungauged basins; to overcome this limitation, a method to estimate these effects at ungauged basins using relations between the effects and static basin characteristics is demonstrated, allowing the model to be applied to ungauged basins in the study region. As a result, this method generalizes regional spatial regression methods [43,47] for making predictions of flood quantiles in ungauged basins without using the potentially inaccurate approach of trading space for time [50,51]. The second goal is to introduce methods for applying the model results to the flood-frequency distributions under nonstationarity for a selected basin, focusing on an approach to updating for observed nonstationarity. This approach generalizes the approach used by Over et al. [26] to update flood frequency for the effect of urbanization by introducing a method to apply it to a stochastic regressor, i.e., climate. It can also be understood as generalizing the distributionally parametric linear least-squares regression approach of Serago and Vogel [8] by being distributionally nonparametric and allowing for a nonlinear climate–time relationship. The third goal is to evaluate the efficacy of simulated discharge from a water-balance model as a climate predictor. Such a climate predictor may be understood as a hybrid between the chain-of-models and climate-informed approaches of the classification introduced by Schlef et al. [38], and, to our knowledge, such an approach has not been previously used except by Over et al. [27], who focused on the effects of urbanization and thus did not thoroughly evaluate the discharge from the water-balance model as a climate predictor. The water-balance model used has six parameters that are calibrated for the primary results of the study but do not necessarily need to be calibrated. This model was selected because it is the simplest type of model that includes the basic hydrologic processes that affect how runoff may change with shifts in climate, i.e., snow processes, evapotranspiration, and soil moisture. Because the climatic forcing series, i.e., temperature and precipitation, interact in river basin hydrology in complex and nonlinear ways (and are dependent on each other), we hypothesize that even a simple hydrologic model is preferable to using the climatic forcing series directly as predictors in the regression analysis. For the sake of comparison, precipitation alone is also used as a climate predictor, to determine the additional predictability that comes from using simulated discharge.

Following the fitting and comparison of the selected models and predictors, applications of the modeling approach are considered. General considerations related to use in gauged and ungauged basins are discussed, as well as whether, in the case of a gauged basin, single-station or regional results are preferred. Then, an application to updating peak-flow records to present climate conditions, which gives a set of quantiles that account for nonstationarity and could be used for design, is presented in detail. The paper concludes with a discussion and conclusions section that summarizes the results and places them in the broader context of the analysis and application of nonstationary flood frequency.

2. Materials and Methods

In this section, the single-station and panel regression methods and the goodness-of-fit statistics used to evaluate and compare them are presented, along with the method used to compute temporal trends. The material in this section relies heavily on the Methods section of Over et al. [27], but full details are included here for completeness. A graphical depiction (flowchart) of the analysis methods used in this paper is presented in Figure 1.

2.1. Single-Station Regression

The baseline approach used in this study is to regress the log-transformed peak discharges on the explanatory variables, one station at a time. This was achieved using linear least-squares (LS) regression and linear quantile regression (QR) for selected probabilities.

Mathematically, the LS regression model can be expressed as:

$$f(Q_{it}) = Y_{it} = X_{it}^{1'} β_{i}^{SS} + e_{it}^{SS}, \quad i = 1, 2, \dots, N, \quad t = 1, 2, \dots, T_{i}, \qquad (1)$$

where the dependent variable $Y_{it} = f(Q_{it})$ is the transformed annual maximum peak flow, with $Q_{it}$ being the annual maximum peak flow at station i in year t and $f(\cdot)$ being some monotonically increasing transformation such as $\log(\cdot)$; $X_{it}^{1'}$ is the vector of explanatory variables (here, climate plus an initial 1 to estimate the intercept) for the ith station and year t; $β_{i}^{SS}$ is the vector of single-station regression coefficients with the first element being the intercept; and $e_{it}^{SS}$ is the error, where $E[e_{it}^{SS}] = 0$. In LS regression, the parameters $β_{i}^{SS}$ are estimated by determining the values $\hat{β}_{i}^{SS}$ that minimize the sum of squared errors between the predictions $\hat{Y}_{it} = X_{it}^{1'} \hat{β}_{i}^{SS}$ and the observed peak-flow values $Y_{it}$, i.e.,

$$\hat{β}_{i}^{SS} = \min_{β_{i}^{SS}} \left[ \sum_{t = 1}^{T_{i}} (\hat{Y}_{it} - Y_{it})^{2} \right], \qquad (2)$$

where $\min_{θ} [Z(θ)]$ returns the value of $θ$ that minimizes Z. The LS regression model estimates the conditional mean of the predictand $Y_{it}$ given the predictors $X_{it}^{1'}$, i.e.,

$$E[Y_{it} \mid X_{it}^{1'}] = X_{it}^{1'} β_{i}^{SS}. \qquad (3)$$

The single-station QR model is like the least-squares model but is quantile (τ) dependent, as it estimates conditional quantiles rather than the conditional mean:

$$Y_{it} = X_{it}^{1'} β_{i}^{SS}(τ) + e_{it}^{SS}(τ), \qquad (4)$$

and

$$Q^{(τ)}[Y_{it} \mid X_{it}] = X_{it}^{1'} β_{i}^{SS}(τ), \qquad (5)$$

where $Q^{(τ)}[Y_{it} \mid X_{it}]$ indicates the conditional quantile with probability $τ$; for example, in the present context, $τ = 0.90$ indicates the flood with a non-exceedance probability of 0.90, that is, an annual exceedance probability (AEP) of 0.10, or a 10-year flood. Note that Equation (5) implies that the τth quantile of the errors $e_{it}^{SS}(τ)$, $Q^{(τ)}[e_{it}^{SS}(τ)]$, is zero, analogously to how Equation (3) implies that the errors $e_{it}^{SS}$ of the least-squares model have a zero mean.

The parameters $β_{i}^{SS}(τ)$ of a QR model are determined by minimizing a quantile-dependent function of the difference between the observations and the prediction:

$$\hat{β}_{i}^{SS}(τ) = \min_{β_{i}^{SS}} \left[ \sum_{t = 1}^{T_{i}} ρ_{τ}(Y_{it} - \hat{Y}_{it}(τ)) \right], \qquad (6)$$

where $\hat{Y}_{it}(τ) = X_{it}^{1'} \hat{β}_{i}^{SS}(τ)$ and $ρ_{τ}(u)$, the QR “check function,” is given by:

$$ρ_{τ}(u) = I(u > 0)\, τ\, |u| + I(u \leq 0)\, (1 - τ)\, |u|, \qquad (7)$$

where $I(x)$ is the indicator function, that is, $I(x) = 1$ if x is true; otherwise $I(x) = 0$. From the definition of the check function, the QR minimization (Equation (6)) can be rewritten as:

$$\hat{β}_{i}^{SS}(τ) = \min_{β_{i}^{SS}} \left[ \sum_{t = 1}^{T_{i}} I(u_{it} > 0)\, τ\, |u_{it}| + \sum_{t = 1}^{T_{i}} I(u_{it} \leq 0)\, (1 - τ)\, |u_{it}| \right]. \qquad (8)$$

Equation (8) shows that deviations $u_{it} = Y_{it} - \hat{Y}_{it}(τ)$ are unequally weighted in the summation: positive $u_{it}$ values are weighted by τ and negative values are weighted by $1 - τ$. As a result, for example, when τ is large, positive deviations have the larger weight, so having fewer of them reduces the sum, making it better to select parameters that make $\hat{Y}_{it}(τ)$ also relatively large. The opposite holds when τ is small. For $τ = 0.5$, deviations are equally weighted (the check function reduces to $ρ_{0.5}(u) = (1/2) |u|$), which shows that the QR solution reduces to the least absolute deviations solution, which provides the conditional median prediction.
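
The weighting described by Equations (7) and (8) can be illustrated with a minimal sketch. It is written in Python for illustration only (the authors used the rq function in R); the function names and the data values are assumptions made up for this example, and the model is reduced to an intercept alone, in which case minimizing the summed check function amounts to picking an empirical quantile of the observations.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) of Equation (7)."""
    return tau * abs(u) if u > 0 else (1.0 - tau) * abs(u)


def total_check_loss(y, prediction, tau):
    """Summed check loss of a constant prediction (intercept-only model)."""
    return sum(check_loss(yi - prediction, tau) for yi in y)


def best_constant(y, tau):
    """Search the observed values for the constant with the smallest loss.

    For an intercept-only model a minimizer always exists among the
    observations, so a search over them is sufficient here.
    """
    return min(sorted(y), key=lambda c: total_check_loss(y, c, tau))


# Hypothetical transformed peak flows (made-up values, not study data).
y = [float(v) for v in range(1, 12)]

low = best_constant(y, 0.1)     # small tau pulls the fit toward small values
median = best_constant(y, 0.5)  # tau = 0.5 gives least absolute deviations
high = best_constant(y, 0.9)    # large tau pulls the fit toward large values
```

With a large τ the fitted constant sits near the top of the sample, and with τ = 0.5 it coincides with the sample median, which mirrors the argument made in the paragraph above.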

Single-station quantile regressions were run for τ values from 0.04 to 0.96, so AEPs of 0.96 to 0.04 or 1.04 to 25-year floods. Attempting to compute regressions for more extreme τ values is not warranted because most stations have records too short to support their computation.
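
The correspondence between τ, AEP, and return period used in this paragraph is simple arithmetic. The sketch below (Python, with helper names that are assumptions of this example) spells it out.

```python
def aep_from_tau(tau):
    """Annual exceedance probability for non-exceedance probability tau."""
    return 1.0 - tau


def return_period_years(tau):
    """Return period in years, the reciprocal of the AEP."""
    return 1.0 / aep_from_tau(tau)


# Endpoints of the tau range used for the single-station regressions,
# plus the tau = 0.90 example from the text.
periods = {tau: return_period_years(tau) for tau in (0.04, 0.90, 0.96)}
```

Under this convention the lower end of the range corresponds to a flood exceeded in almost every year, and the upper end to the 25-year flood.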

The single-station LS regressions were performed using the lm function from the R stats package [73]. The corresponding quantile regressions were performed using the rq function from the quantreg package [74] in R.

2.2. Panel–Quantile Regression

The MMQR approach of Machado and Santos Silva [72] for estimating conditional quantile coefficients from panel data was selected because it allows for distributional fixed effects and, in particular, was found to be superior to the pure location shift approach of Canay [71] in an analysis of peak-flow quantiles by Over et al. [27], which uses similar methods but is focused on the effects of urbanization. MMQR uses least-squares regression alone, without using quantile regression, to solve for the model parameters. The MMQR approach is based on the following location-scale panel model:

$$f(Q_{it}) = Y_{it} = a_{i}^{P} + X_{it}^{'} β^{P} + (δ_{i} + X_{it}^{'} γ)\, U_{it}, \quad i = 1, \dots, N \text{ and } t = 1, \dots, T_{i}, \qquad (9)$$

where, for basin i at time t, $Y_{it} = f(Q_{it})$ is the dependent variable, where $f(Q_{it})$ is the peak flow transformed by some monotonically increasing function $f(\cdot)$, e.g., $\log(\cdot)$; $a_{i}^{P} + X_{it}^{'} β^{P}$ is the location term; and $(δ_{i} + X_{it}^{'} γ)\, U_{it}$ is the scale term, where $U_{it}$ is a realization of a mean-zero random variable. The location and scale terms are denoted as such because the mean of $Y_{it}$ is modeled by the location term and the variance of $Y_{it}$ by the scale term; the former holds because $U_{it}$ has a mean of zero, and the latter because the location term is non-random and thus does not contribute to the variance. In the location term, $a_{i}^{P}$ are the fixed effects; $X_{it}^{'}$ are, in general, explanatory variables, although $X_{it}^{'}$ here comprises a single climate variable, which is assumed to be strictly exogenous (thus independent of the error term) and to satisfy other technical conditions; and $β^{P}$ is the panel model climate coefficient. The scale term generalizes the independent and identically distributed (i.i.d.) error term of the single-station regression models; in this model, the error $U_{it}$ is scaled by a positive scale factor $δ_{i} + X_{it}^{'} γ$, which is the sum of a second fixed effect $δ_{i}$ and the product of a second vector of coefficients $γ$ and the explanatory variables $X_{it}^{'}$. The error $U_{it}$ is an i.i.d. realization of a random variable U with density function $f_{U}(\cdot)$ bounded away from zero (thus excluding discrete distributions) and distribution function $F_{U}(\cdot)$ and satisfying the moment conditions $E(U) = 0$ and $E(|U|) = 1$. Other than these conditions on U, its distribution is unspecified and is estimated from the data, so the MMQR approach is nonparametric with respect to probability distribution.

The quantities of interest are the conditional quantiles of the transformed peak flows $Y_{it} = f(Q_{it})$ (Equation (9)). These quantiles are given as [72]:

$$Q_{Y}(τ \mid X_{it}) = a_{i}^{P} + δ_{i}\, q(τ) + X_{it}^{'} (β^{P} + γ\, q(τ)), \qquad (10)$$

where $q(τ) = F_{U}^{-1}(τ)$ is the quantile function of U. The coefficient of the explanatory variables in Equation (10),

$$β(τ) = β^{P} + γ\, q(τ), \qquad (11)$$

is the regression quantile coefficient for the τth quantile $Q_{Y}(τ \mid X_{it})$; it gives the response of the τth quantile to the explanatory variables $X_{it}^{'}$ (here, a single climate variable). Notice that the regression quantile coefficients $β(τ)$ depend on $τ$ only through $γ\, q(τ)$; therefore, because $q(τ)$ is monotonically increasing, $β(τ)$ is also monotonic in $τ$, with its direction set by the sign of the scale coefficient $γ$. Therefore, if the coefficient $γ$ is positive, the response of the peak-flow quantiles to the climate variable increases with increasing $τ$, and it decreases if $γ$ is negative. The term $a_{i}^{P} + δ_{i}\, q(τ)$ is called the distributional fixed effect; this term is the sum of the location effect $a_{i}^{P}$, which can be considered the mean fixed effect because $\int_{0}^{1} q(τ)\, dτ = 0$, and $δ_{i}\, q(τ)$, which models the dependence of the fixed effect on $τ$. Notice further that the distributional fixed effect gives the quantiles when $X_{it}^{'} = 0$, and when $X_{it}^{'} = 0$ is physically meaningful, it is expected that $δ_{i}$ will be positive so that the distributional fixed effect will increase with $τ$.
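
As a brief sketch of where Equation (10) comes from (a restatement of the standard location-scale argument, not a derivation quoted from the paper): conditional on the explanatory variables and the fixed effects, and assuming as stated above that the scale factor $δ_{i} + X_{it}^{'} γ$ is positive, $Y_{it}$ in Equation (9) is an increasing affine function of $U_{it}$. Its τth quantile is therefore obtained by replacing $U_{it}$ with $q(τ)$ and regrouping terms:

```latex
\begin{aligned}
Q_{Y}(τ \mid X_{it})
  &= a_{i}^{P} + X_{it}^{'} β^{P} + (δ_{i} + X_{it}^{'} γ)\, q(τ) \\
  &= a_{i}^{P} + δ_{i}\, q(τ) + X_{it}^{'} \bigl( β^{P} + γ\, q(τ) \bigr).
\end{aligned}
```

The second line only regroups the first, which is what separates the distributional fixed effect $a_{i}^{P} + δ_{i}\, q(τ)$ from the quantile coefficient $β(τ)$ of Equation (11).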

The parameters of the location-scale panel model, except for $q(τ)$, are estimated by a pair of standard panel model “within” estimates [52] (ch.14) and [57] (ch.14). In the first, the parameters $β^{P}$ and $a_{i}^{P}$ of the location model are estimated by applying the within estimator to the peak-flow data $Y_{it}$ and climate data $X_{it}^{'}$ as follows [72]:

1. Compute the estimate $\hat{β}^{P}$ of the location model coefficient $β^{P}$ by regressing the time-demeaned observations $Y_{it} - \sum_{t} Y_{it} / T_{i}$, where $T_{i}$ is the number of observations at the ith basin, on the time-demeaned explanatory (climate) variables, $X_{it} - \sum_{t} X_{it} / T_{i}$, by least squares. Time-demeaning eliminates the location effects $a_{i}^{P}$ and the differences in means between the demeaned variables, so the regression only considers the variations “within” the data associated with each individual (i.e., station) i as they vary through time to compute $\hat{β}^{P}$.
2. Compute the estimates of the location effects $a_{i}^{P}$ as $\hat{a}_{i}^{P} = (1 / T_{i}) \sum_{t} (Y_{it} - X_{it}^{'} \hat{β}^{P})$, and define the residuals $\hat{R}_{it} = Y_{it} - \hat{a}_{i}^{P} - X_{it}^{'} \hat{β}^{P}$, which estimate $(δ_{i} + X_{it}^{'} γ)\, U_{it}$, according to Equation (9).

In the second, the parameters of the scale term, i.e., the coefficients $γ$ and the fixed effects $δ_{i}$, are estimated by applying the within estimator to the absolute values $|\hat{R}_{it}|$ of the residuals of the location model as a function of the climate data $X_{it}^{'}$ as follows:

3. Compute the estimate $\hat{γ}$ of the scale model coefficient $γ$ by regressing the time-demeaned absolute residuals $|\hat{R}_{it}| - \sum_{t} |\hat{R}_{it}| / T_{i}$ on the time-demeaned explanatory variables $X_{it} - \sum_{t} X_{it} / T_{i}$ by least squares.
4. Compute the estimates $\hat{δ}_{i}$ of the scale effects $δ_{i}$ as $\hat{δ}_{i} = (1 / T_{i}) \sum_{t} (|\hat{R}_{it}| - X_{it}^{'} \hat{γ})$.

The final step is to estimate the quantile function of U, $q(τ)$, which is achieved by using the quantiles of the U estimate

$$\hat{U}_{it} = \hat{R}_{it} / (\hat{δ}_{i} + X_{it}^{'} \hat{γ}). \qquad (12)$$

As mentioned, quantile regression does not appear in the estimation procedure; instead, as befits the “quantiles via moments” moniker and the location-scale model setup, the parameters are estimated by a pair of conditional mean panel regressions and from the quantiles of $\hat{U}_{it}$.
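
The estimation sequence just described (two within regressions followed by the quantiles of the standardized residuals of Equation (12)) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' R code: it handles a single climate predictor, uses a plain order-statistic quantile, and omits the bias correction and standard errors. All function names and the tiny two-station dataset are hypothetical; the dataset uses contrived two-point errors (U = ±1) only so that the fit is exactly recoverable, whereas the model assumes a continuous U.

```python
import math
from statistics import mean


def within_fit(xs_by_station, zs_by_station):
    """Within estimator, one predictor: returns (slope, fixed effects)."""
    num = den = 0.0
    for station, xs in xs_by_station.items():
        zs = zs_by_station[station]
        x_bar, z_bar = mean(xs), mean(zs)
        num += sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs))
        den += sum((x - x_bar) ** 2 for x in xs)
    slope = num / den
    # Fixed effect = station mean of (z - slope * x).
    effects = {s: mean(zs_by_station[s]) - slope * mean(xs)
               for s, xs in xs_by_station.items()}
    return slope, effects


def empirical_quantile(values, tau):
    """Order-statistic quantile without interpolation (an assumption)."""
    ordered = sorted(values)
    k = min(len(ordered) - 1, max(0, math.ceil(tau * len(ordered)) - 1))
    return ordered[k]


def mmqr_fit(xs_by_station, ys_by_station, taus):
    # Location model: slope beta and location effects a_i, then residuals.
    beta, a = within_fit(xs_by_station, ys_by_station)
    resid = {s: [y - a[s] - beta * x for x, y in zip(xs, ys_by_station[s])]
             for s, xs in xs_by_station.items()}
    # Scale model: same estimator applied to the absolute residuals.
    abs_resid = {s: [abs(r) for r in rs] for s, rs in resid.items()}
    gamma, delta = within_fit(xs_by_station, abs_resid)
    # Equation (12): standardized residuals, pooled over all stations.
    u_hat = [r / (delta[s] + gamma * x)
             for s, xs in xs_by_station.items()
             for x, r in zip(xs, resid[s])]
    q = {tau: empirical_quantile(u_hat, tau) for tau in taus}
    # Equation (11): quantile coefficient beta(tau) = beta + gamma * q(tau).
    beta_tau = {tau: beta + gamma * q[tau] for tau in taus}
    return {"beta": beta, "a": a, "gamma": gamma, "delta": delta,
            "q": q, "beta_tau": beta_tau}


# Hypothetical data generated from Equation (9) with known parameters.
true_effects = {"A": (2.0, 0.2), "B": (3.0, 0.3)}  # (a_i, delta_i)
beta_true, gamma_true = 0.8, 0.1
xs = {"A": [1.0, 1.0, 3.0, 3.0], "B": [2.0, 2.0, 5.0, 5.0]}
us = [-1.0, 1.0, -1.0, 1.0]  # contrived errors: mean 0, mean |U| = 1
ys = {s: [true_effects[s][0] + beta_true * x
          + (true_effects[s][1] + gamma_true * x) * u
          for x, u in zip(xs[s], us)]
      for s in xs}

fit = mmqr_fit(xs, ys, taus=(0.1, 0.9))
```

Because the scale coefficient in this example is positive, the fitted quantile coefficient is larger for τ = 0.9 than for τ = 0.1, consistent with the monotonic dependence on τ noted for Equation (11).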

For a dataset with a finite number $T_{i}$ of observations per station, the estimated quantile regression coefficients $\hat{β}(τ)$ are biased because of biases in $\hat{γ}$ and $\hat{q}(τ)$ having leading terms proportional to $1 / T_{i}$ [72]. Simulations in Machado and Santos Silva [72] showed that the split-panel jackknife bias correction of Dhaene and Jochmans [75] was effective in removing much of the bias; that correction was applied in this analysis as well.

Quantile regression coefficient standard errors were computed using bootstrapping, sampling the stations with replacement 100 times. The use of a larger number of samples was tested but the differences were very small.
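
A station-level bootstrap of this kind can be sketched as below. The sketch is in Python and is only an illustration of resampling whole stations with replacement; the function names, the seed, the toy data, and the use of a pooled within slope as the statistic are all assumptions of this example rather than details taken from the study.

```python
import random
from statistics import stdev


def pooled_within_slope(sample):
    """Within-estimator slope for a list of (xs, ys) station records."""
    num = den = 0.0
    for xs, ys in sample:
        x_bar = sum(xs) / len(xs)
        y_bar = sum(ys) / len(ys)
        num += sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        den += sum((x - x_bar) ** 2 for x in xs)
    return num / den


def station_bootstrap_se(stations, estimator, n_boot=100, seed=0):
    """Standard error from resampling stations (not years) with replacement.

    A station drawn more than once enters the sample as separate records,
    so each copy keeps its own time-demeaning.
    """
    rng = random.Random(seed)
    ids = sorted(stations)
    estimates = []
    for _ in range(n_boot):
        drawn = [rng.choice(ids) for _ in ids]
        estimates.append(estimator([stations[s] for s in drawn]))
    return stdev(estimates)


# Hypothetical noise-free stations whose slopes differ (made-up values).
x_values = [1.0, 2.0, 3.0, 4.0]
stations = {name: (x_values, [intercept + slope * x for x in x_values])
            for name, intercept, slope in
            [("A", 1.0, 0.4), ("B", 2.0, 0.5), ("C", 3.0, 0.6)]}

se = station_bootstrap_se(stations, pooled_within_slope)
```

Resampling whole stations keeps each station's time series intact, so any serial dependence within a station is carried into every bootstrap sample.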

The panel regression coefficient estimates were computed using the function plm in the R package plm [76]. Their robust, heteroscedasticity-consistent standard errors were computed using the vcovHC function from the plm package [77] with the method of Arellano [78], which also makes the standard-error estimates robust with respect to intra-station serial correlation. The code for other computations related to the MMQR method was written by the authors in R [73,79].

2.3. Goodness-of-Fit Statistics

To compare different least-squares models of the same method (that is, different single-station models or different panel models), adjusted R², denoted here as $R_{adj}^{2}$, is used. $R_{adj}^{2}$ is an adjustment of the coefficient of determination R² for the number of parameters used and is defined as follows [80] (ch.11):

$$R_{adj}^{2} = 1 - \frac{(n - 1)\,SSE}{(n - p)\,SS_{y}} = 1 - \frac{MSE}{SS_{y}/(n - 1)} = 1 - (1 - R^{2})\,\frac{n - 1}{n - p}, \tag{13}$$

where n is the number of observations, p is the number of estimated parameters, SSE is the error sum of squares (the sum of squared residuals), $SS_{y}$ is the total sum of squares, i.e., the sum of the squared differences of the observations from their mean, $MSE = SSE/(n - p)$ is the mean square error, and $R^{2} = 1 - SSE/SS_{y}$ is the coefficient of determination. For the single-station regressions, just two coefficients are estimated, so $p = 2$. For the first-stage panel regression, which is used to estimate the coefficient $\beta^{P}$ and the location effects $a_{i}^{P}$, $p = 1 + N$, where $N = 330$ is the number of basins in the panel and also the number of location effects that are estimated. For comparison, for the single-station regressions, the number of observations n varies from 38 to 98 (Section 2.5 below), and for the panel regressions, $n = 21{,}377$, which is the number of observations summed across all the stations.
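As a check on Equation (13), the following Python sketch evaluates its three algebraically equivalent forms on a small made-up data set. The function name `adjusted_r2` and the toy values are hypothetical; only the formula and the panel counts (n = 21,377 and p = 1 + 330) come from the text.

```python
def adjusted_r2(y, y_hat, p):
    """Return the three equivalent forms of adjusted R^2 in Equation (13)."""
    n = len(y)
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error sum of squares
    ss_y = sum((yi - y_bar) ** 2 for yi in y)              # total sum of squares
    mse = sse / (n - p)                                    # mean square error
    r2 = 1 - sse / ss_y                                    # coefficient of determination
    form1 = 1 - ((n - 1) * sse) / ((n - p) * ss_y)
    form2 = 1 - mse / (ss_y / (n - 1))
    form3 = 1 - (1 - r2) * (n - 1) / (n - p)
    return form1, form2, form3

# Toy single-station example with p = 2 (intercept and slope); values are illustrative.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.1, 1.9, 3.2, 3.8, 5.0]
forms = adjusted_r2(y, y_hat, p=2)

# Penalty factor (n - 1)/(n - p) for the first-stage panel regression.
panel_factor = (21377 - 1) / (21377 - (1 + 330))
```

With the panel counts given above, the penalty factor is close to 1 (about 1.016), so estimating 330 location effects costs little in adjusted R² terms given more than 21,000 observations.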

To compare the quantile regression results for the same model from different approaches, two statistics are used. One is based on the properties of the weighted quantile regression deviations $\rho_{\tau}(y_{i} - X_{i}'\hat{\beta})$ (where i indexes observations, i.e., individual peak flows), which is similar to the loss function used by Ouali et al. [60]. These deviations are the absolute residuals weighted according to the quantile regression check function (Equation (7)), and it is their sum that is minimized when standard quantile regression is performed (Equation (6)). As discussed by Ouali et al. [60], because this loss function depends only on the observed peak flows $y_{i}$, its use avoids the need for at-site estimates of the quantiles, which are themselves uncertain because they typically involve the assumption of a fitting distribution, in addition to sampling error.

One connection that can be made between the weighted quantile regression deviations and more common goodness-of-fit statistics is for $\tau = 0.5$, in which case the check function $\rho_{0.5}(u)$ (Equation (7)) reduces to $\rho_{0.5}(u) = (1/2)|u|$, and thus $\rho_{\tau}(y_{i} - X_{i}'\hat{\beta})$ becomes $(1/2)|y_{i} - X_{i}'\hat{\beta}|$, i.e., half the absolute model residuals. Therefore, for $\tau = 0.5$, the weighted quantile regression deviations, after multiplying by 2, can be used to estimate measures of absolute error, such as typical values or the mean.
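The reduction at τ = 0.5 can be verified numerically. The sketch below assumes that the check function of Equation (7), which is not reproduced in this section, takes the standard form ρ_τ(u) = u(τ − 1[u < 0]); the function names are hypothetical.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1[u < 0]).

    Assumed to match Equation (7); this is the standard quantile
    regression form, not copied from the article.
    """
    return u * (tau - (1.0 if u < 0 else 0.0))

def weighted_deviations(y, y_fit, tau):
    """Weighted quantile regression deviations, one per observation."""
    return [check_loss(yi - fi, tau) for yi, fi in zip(y, y_fit)]

# At tau = 0.5, each deviation is half the absolute residual.
residuals = [2.0, -2.0, 0.5, -0.1]
half_abs = [check_loss(u, 0.5) for u in residuals]

# Away from the median the weighting is asymmetric: for tau = 0.9 an
# under-prediction (u > 0) is penalized nine times as heavily as an
# over-prediction of the same size.
under = check_loss(1.0, 0.9)
over = check_loss(-1.0, 0.9)
```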

The values taken by the weighted quantile regression deviations vary among the methods because each method has its own constraints. The single-station quantile regressions are less constrained than those from the MMQR method, i.e., they have more fitted parameters, as a separate set of regressions is fit for each station. In the MMQR approach, the number of estimated parameters is much reduced because instead of an independent slope for each quantile, the coefficients are constrained to a certain functional form, i.e., $\beta^{P} + \gamma q(\tau)$, which varies with $\tau$ only according to $q(\tau)$, the quantile function of the error term U. In addition, instead of one intercept per quantile, the MMQR approach has the distributional fixed effects $a_{i}^{P} + \delta_{i} q(\tau)$, which adds just two more parameters per station.

Another property that indicates the appropriateness of a quantile regression model is the presence of crossing of fitted quantile surfaces with different $\tau$ values; these surfaces are simply lines for the one-predictor models being considered here. It is a consequence of the constraints inherent in the MMQR approach that the conditional quantiles do not cross for any fitted value $\hat{y}_{i}(\tau) = \hat{a}_{i}^{P} + \hat{\delta}_{i}\hat{q}(\tau) + X_{it}'(\hat{\beta}^{P} + \hat{\gamma}\hat{q}(\tau))$ such that the scale $\hat{\delta}_{i} + X_{it}'\hat{\gamma}$ is positive [72], which is equivalent to $\hat{y}_{i}(\tau)$ monotonically increasing in $\tau$. No such constraint exists for the single-station quantile-regression approach adopted here, although such formulations of quantile regression are available [81,82]. Therefore, the fraction of crossing quantiles is the other statistic used here to compare the different approaches for the same model. This fraction was assessed by computing, for each observation $i$ and quantile $\tau_{j}$, how often the magnitudes of neighboring pairs of fitted values are out of order, that is, whether $y_{i}(\tau_{j-1}) > y_{i}(\tau_{j})$, for example, whether $y_{i}(0.04) > y_{i}(0.10)$. Notice that with this method, crossing never occurs for the smallest $\tau$ value. This method also ignores whether crossing occurs between non-neighboring quantiles, which might be an indicator of the severity of the non-monotonicity, but accounting for it seemed unnecessarily complicated for an initial analysis.

2.4. Computation of Temporal Trends

The properties of the temporal trends in the annual maximum peak flows and in the predictors are presented to provide context regarding the trends to be explained by the regressions. Trends were analyzed with the Mann–Kendall test [80] (ch.12) and [83]. In this test, Kendall’s tau correlation is computed for the variable of interest and time, as first suggested by Mann [83]. Because Kendall’s tau correlation is rank-based and thus nonparametric, the results of the Mann–Kendall test are invariant with respect to monotonic but possibly nonlinear transformations, and it does not require normality of the residuals to test hypotheses; as such, it is also relatively resistant to outliers. The Kendall’s tau correlations from the Mann–Kendall test are, however, sensitive to serial correlation [80] (ch.12). Computation of the serial correlation of the log-transformed peak-flow series showed that 8.5% of the series have significant (α = 0.05 for a two-sided test) lag-one serial correlation, so serial correlation is sometimes present but is uncommon, and so it was ignored in this analysis. The result is that the true p-values for the Kendall tau values in the Mann–Kendall trend analyses for the affected stations are somewhat larger than the reported values.
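The rank-based character of the test can be illustrated with a minimal Python sketch that computes the Mann–Kendall S statistic and the corresponding Kendall's tau for a series ordered in time. It is a simplification: ties are not handled, no p-value is computed, and the function names and toy peak values are hypothetical.

```python
import math

def mann_kendall_s(series):
    """Mann-Kendall S: concordant minus discordant pairs of a time-ordered series."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the later-minus-earlier difference
    return s

def kendall_tau(series):
    """Kendall's tau of the series against time (no tie correction)."""
    n = len(series)
    return mann_kendall_s(series) / (n * (n - 1) / 2)

# Toy annual peaks (illustrative only, not from the study).
peaks = [120.0, 95.0, 150.0, 130.0, 210.0, 180.0, 260.0]
tau_raw = kendall_tau(peaks)
# A monotonic transformation such as the logarithm leaves the ranks, and
# therefore tau, unchanged.
tau_log = kendall_tau([math.log(q) for q in peaks])
```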

2.5. Station Selection and Peak-Flow Data

The study region comprises the nine-state region of Illinois, Iowa, Michigan, Minnesota, Missouri, Montana, North Dakota, South Dakota, and Wisconsin, and selected neighboring areas in the midwestern United States (Figure 2). This region is being studied in a comprehensive peak-flow trends study [84] of which this study is a part. The first set of results from the study [65] considered trend periods of 100, 75, 50, and 30 years, all ending in 2020, and documented changes in peak-flow magnitude and timing and several climate variables including observed precipitation and temperature and modeled soil moisture, snowfall, snowpack, evapotranspiration, and runoff [85]. The observed changes vary spatially within the study region, with peak-flow trends generally being positive to the southeast for all trend periods but more complex patterns in other parts of the region. For example, during the 50-year trend period, peak-flow trends are mostly negative in the western half of the study region, as well as in the northeast, but they are positive in the north-central region (Red River of the North and surroundings), a region that experienced multiple major floods during that period [86]. Very broadly, the peak-flow trends agree with trends in precipitation and modeled runoff, but there is substantial local disagreement between the peak-flow and climate trends, which likely arises from non-climatic factors such as changes in land use and agricultural practices.

To select basins for use in this analysis, gauged basins from Ryberg et al. [85] that were present in at least one of the 100-, 75-, or 50-year trend periods ending in water year 2020 were identified, totaling 590 such basins. In selecting their study basins, Ryberg et al. [85] first removed peaks with certain qualification codes from the peak-flow records for candidate basins, including those affected by dam failure or by regulation or diversion and those not from the systematic record. Then, they applied a record completeness test, and, finally, removed basins whose flow is substantially regulated by reservoirs [87]. For this study, additional filters were applied: (1) basins crossing the Canadian border were removed because of climate and land cover data availability; (2) basins with greater than 4% imperviousness in 2019 according to the 2019 National Land Cover Database [88,89] were removed; and (3) basins not present in the GAGES-II dataset [90] were also removed, to ensure that streamgages had daily discharge data and because GAGES-II provides a set of precomputed basin characteristics, which were used to explore the dependence of the fixed effects on basin properties, as described in the Results section. Applying the filters discussed above reduced the initial 590 basins to 404. The 4% upper limit on imperviousness was selected to ensure that the basins in this study did not overlap with those in a previous study of the effects of urbanization and climate using similar methods [27]. For reference, according to the selected model in that study, a change in imperviousness of 4% corresponds to a 5% increase in the peak-flow quantile with an AEP of 0.5, with the effect of imperviousness decreasing with decreasing AEP and becoming not significantly (α = 0.05) different from zero for AEPs of 0.04 and smaller.

A final filter was applied to address the redundancy arising from the nesting of gauged basins within one or more other gauged basins. When these are relatively close in drainage area, they are statistically redundant [91]. To address this issue, a redundancy analysis was implemented to establish a non-redundant subset of basins appropriate for panel regression analysis. The algorithm for determining how to remove redundant basins optimally, developed by Over et al. [26], was applied here. The algorithm assigns a score to each basin according to its record length and drainage–basin properties relative to other basins. The score a basin receives is higher the more drainage-area overlap it has with other basins, the shorter its record length, and the more similar its drainage area is compared to that of other basins in the collection of basins being considered. Of the 404 remaining basins, 74 were determined to be redundant. Regression analysis was applied to the remaining 330 non-redundant basins.

Peak-flow data for each streamgage were obtained from Marti et al. [92]. Because the climate data used in this study are only available up to 2018, the periods used in this study also end at water year (WY) 2018, where a water year begins on October 1, ends on September 30, and is named for the calendar year in which it ends. As a result of this limitation on the climate data, the longest datasets run from WY 1921 to WY 2018, a total of 98 years. To address the presence of zeroes in the peak-flow data when log-transforming them, for those stations with zeroes, one-tenth of the smallest positive peak-flow value was added to all peak-flow values for that station before use in the regression analysis. A total of six zeroes are in the study peak-flow dataset; these occur at three stations, all of which are located in the Prairie Pothole Region [93] in eastern North and South Dakota.
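The water-year convention and the zero-offset rule described above can be sketched in Python as follows. The function names are hypothetical, and the natural logarithm is an assumption (it is consistent with the exponentiation of errors in Section 3.4).

```python
import math
from datetime import date

def water_year(d):
    """Water year: begins October 1 and is named for the calendar year in which it ends."""
    return d.year + 1 if d.month >= 10 else d.year

def log_peaks_with_zero_offset(peaks):
    """Log-transform one station's peaks, offsetting all values if any is zero.

    For a station with zeroes, one-tenth of its smallest positive peak is
    added to every peak before taking logs; other stations are unchanged.
    """
    if any(q == 0 for q in peaks):
        offset = 0.1 * min(q for q in peaks if q > 0)
    else:
        offset = 0.0
    return [math.log(q + offset) for q in peaks]

# Toy station with one zero peak (illustrative values only).
logged = log_peaks_with_zero_offset([0.0, 5.0, 20.0])
```

In this toy case the offset is 0.5, so the zero peak maps to log(0.5) and the other peaks shift only slightly.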

The selected stations are distributed across the study area, although with some areas of lower density, such as in central Minnesota and eastern Montana (Figure 2). The number of years of peak-flow record for the selected streamgages varies from 38 to 98 years, with a median of 69 years, and the basin properties vary widely, with drainage areas ranging from about 28 to about 38,700 square kilometers (km²), mean annual precipitation (MAP) from 317 to 1755 millimeters (mm), aridity (the ratio of mean annual potential evapotranspiration to MAP) from about 0.25 to about 2, and mean basin elevation from 145 to 3137 meters (m) (Table 1). Generally, the study region becomes progressively drier (indicated by both less precipitation and greater aridity) going from the east to the west, until the mountains of western Montana are reached. Similarly, basin elevation gradually increases from the east to the west until large increases in the western Montana mountains. The wide range of the basin properties helps to make the fitted model coefficients robust and thus applicable to a wide range of basins throughout the study region.

2.6. Computation of the Climate Predictors

Daily precipitation and temperature data from 1915 to 2018 (the extent of the dataset) from Pierce et al. [94], which is gridded from point (gage) observations, were averaged by basin for use in the regression analysis. The precipitation data from Pierce et al. [94], which is a revised and updated version of the precipitation data from Livneh et al. [95], preserves extremes in the underlying daily gage data that are lost when a time adjustment is applied to put the data on a uniform timeframe, as is often done in developing gridded datasets. Time series of two precipitation-based climate variables were computed from the basin-mean precipitation data: (1) ann_max_prcp, the annual maximum daily precipitation in mm; and (2) ann_tot_prcp, the annual total precipitation in mm. Discharge-based climate time series were computed using a version of the simple water-balance model MWBM [96,97] run at a daily time step in a spatially lumped manner. Inputs to MWBM comprise daily total precipitation and mean temperature, the latitude of the basin’s streamgage, and six adjustable parameters (Table 2). Despite being a simple model with a single soil storage variable, MWBM simulates the main elements of runoff generation: direct runoff, infiltration, saturation excess runoff, soil storage, snowpack development and melt, and evapotranspiration, which is controlled by soil moisture and potential evapotranspiration. As such, it simulates the combined and nonlinear effects of variations in precipitation and temperature on floods, such as changes in antecedent moisture and snowpack that would be hard to model by using temperature and precipitation directly in a regression framework. Because of its simplicity, MWBM is used here as an index of the climate in the regression modeling of the annual peaks rather than as a model that is expected to predict peaks per se without an additional model layer, such as the regression models used here.

For use in the calibration of the adjustable parameters, daily observed discharge data were retrieved from the U.S. Geological Survey’s National Water Information System database [98]. Calibration of the adjustable MWBM parameters was performed for each basin separately using a particle-swarm optimization (PSO) algorithm from the hydroPSO R package [99] whenever daily discharge data were available during the period from water years 1915 to 2018 (inclusive), with a 365-day spin-up period. In the calibration, 40 particles were used with 50 iterations at most, and thus 40 sets of calibrated MWBM parameters were produced per basin, from which the parameter sets with the best (highest) Kling–Gupta Efficiency (KGE; [100,101]) values were selected. The selected variables from the calibrated MWBM model output for each water year with a valid peak-flow value comprise the discharge-based climate predictors considered in this study, of which there are two: (1) ann_max_Q, the annual maximum of the daily MWBM discharge values in mm; and (2) ann_tot_Q, the annual total MWBM discharge in mm. The primary results in this study use the MWBM results based on the individually calibrated parameters (termed “at-site calibration”), but the MWBM results based on using the median of the at-site calibrated parameters (termed “median parameters”) are also considered, as a proxy for the performance of MWBM at ungauged basins in the study region.
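The selection criterion can be sketched in Python. The text cites two KGE references without stating which variant was used, so the sketch below assumes the original form, KGE = 1 − sqrt((r − 1)² + (α − 1)² + (β − 1)²), with α the ratio of simulated to observed standard deviations and β the ratio of means. The function names, the candidate structure, and the toy series are hypothetical, and this is not the hydroPSO interface.

```python
import math
import statistics

def kge(sim, obs):
    """Kling-Gupta Efficiency (assumed original form; 1 is a perfect fit)."""
    mu_s, mu_o = statistics.fmean(sim), statistics.fmean(obs)
    sd_s, sd_o = statistics.pstdev(sim), statistics.pstdev(obs)
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / len(obs)
    r = cov / (sd_s * sd_o)  # linear correlation
    alpha = sd_s / sd_o      # variability ratio
    beta = mu_s / mu_o       # bias ratio
    return 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

def best_candidate(candidates, obs):
    """Pick the candidate (a dict with a 'sim' series) having the highest KGE."""
    return max(candidates, key=lambda c: kge(c["sim"], obs))

# Toy daily discharge series (illustrative only).
obs = [1.0, 3.0, 2.0, 5.0, 4.0]
candidates = [
    {"name": "set_1", "sim": [2.0 * q for q in obs]},  # right timing, doubled volume
    {"name": "set_2", "sim": [1.1, 2.8, 2.1, 4.9, 4.2]},
]
best = best_candidate(candidates, obs)
```

Doubling the observed series keeps the correlation at 1 but sets both α and β to 2, which gives KGE = 1 − √2, so the second toy candidate is preferred.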

3. Results

3.1. Temporal Trends

Statistics of the temporal trends in peak flow and in the selected climate variables are given in Figure 3 and Table 3, and peak-flow and ann_max_Q trends are mapped in Figure 2. These statistics show that although a majority (57%) of the peak-flow trends are positive, only about 16% are significantly positive ($p \leq 0.05$). The moderately positive trends of the peak flows contrast with the more common, stronger, and more significantly positive trends in the climate variables, especially in ann_tot_Q and ann_tot_prcp, which are both about 90% positive and 40% significantly positive; the trends in ann_max_Q, being about 85% positive and 35% significantly positive, are not far behind the two annual total climate series. Basins with positive peak-flow trends are found in the southeast and central parts of the study area; the remainder of the study region, i.e., the northeast and western parts, has mostly negative peak-flow trends (Figure 2A). Trends in ann_max_Q are correlated with the peak-flow trends but are more positive, with stronger positive trends in the southeast and central parts of the study region and weaker positive trends or negative trends where the peak-flow trends are negative (Figure 2B). These contrasting trend results indicate that, outside the central and southeast parts of the study region, there are other non-climatic factors affecting the peak-flow trends, despite having filtered the study basins for the effects of reservoir regulation and urbanization, or that the MWBM model is not capturing changes in runoff generation processes, or both.

The correlations among the trends (Table 4, Figure S1) indicate that ann_max_Q is the climate variable most likely to explain the peak-flow trends. However, the trends in the climate variables are more strongly correlated with each other than with the peak-flow trends.

3.2. Goodness of Fit and the Selection of Climate Predictors and Regression Transformations

Adjusted R² values from the least-squares single-station and location-model panel regressions (Figure 4, Table S1) provide the data needed to select the climate predictor variable and the peak-flow and climate transformations, and to test the effect of using median MWBM parameters rather than those obtained from at-site calibration. These results indicate that ann_max_Q is the best climate predictor for both single-station and panel regressions when square-root-transformed, whereas for no or log-transformation, although the distribution of single-station adjusted R² values decreases only slightly, the panel regression value drops substantially, especially for the log-transformation. The results also indicate that square-root transformation of the peaks is somewhat better than log transformation, especially for the panel regressions; however, the results using log-transformed peaks are the primary focus of the remainder of this study because such results are more easily interpreted. The ann_max_prcp climate variable gives the worst fit, presumably because a large fraction of peak flows in the study region did not occur directly in response to precipitation (instead being driven by snowmelt), in addition to its failure to characterize antecedent moisture for those events that are directly precipitation driven. Between the annual total climate predictors, ann_tot_Q and ann_tot_prcp, ann_tot_Q is generally better for single-station regressions, but their results are very similar for panel regression.

Two of the boxplots in Figure 4 present adjusted R² values based on MWBM runs that used the median parameter values (Table 2). The results from these parameters are included to provide an estimate of the results that would be obtained for MWBM predictions in ungauged basins in the study region. These results show that little explanatory power is lost by using the median parameter values, as can be seen by comparing the median MWBM results with the at-site results for the same climate variable and climate-variable and peak-flow transformations.

3.3. Regression Coefficients and Fixed Effects

Regression coefficients from the selected log(peak)~sqrt(ann_max_Q) model (using at-site calibrated MWBM discharge values) are presented in Table 5 and Figure 5. For both the least-squares (lm/plm) and the quantile regressions, the coefficient values show approximate agreement between the MMQR coefficients and the central tendencies of the single-station coefficients, and the least-squares (lm/plm) results agree approximately with the median ($\tau = 0.5$) quantile regression results, which shows that the single-station and panel regression results agree on average. For both the single-station and MMQR regressions, the quantile regression coefficients decrease with increasing $\tau$ (decreasing AEP). Given the typical standard error (SE) of the single-station coefficients (having a median of about 0.2 for a median coefficient of 0.56 to 0.91), this decrease may be of borderline significance for most of the single-station regressions, but it is highly significant for the MMQR (panel) regressions, whose SEs are much smaller than the differences between the coefficients. Because the quantile regression coefficients are given as $\beta(\tau) = \beta^{P} + \gamma q(\tau)$ (Equation (11)) for the MMQR model, the significance of their variation with $\tau$ can likewise be deduced from the significance of the estimate of $\gamma$, which is quite high, as its magnitude is about 12 times its standard error (Table 5). This result shows that for these variables with the selected transformations, the MMQR model will simulate the dependence of peak-flow quantiles on climate more accurately than a reduced model without the dependence of scale on climate (i.e., setting $\gamma = 0$ in Equations (9)–(11)). We note, however, that the quantile regression coefficients from the sqrt(peak)~sqrt(ann_max_Q) model vary with $\tau$ in the opposite direction, i.e., they increase with increasing $\tau$ (Table S2, Figure S2). This result indicates that there may be a combination of transformations for which the quantile regression coefficients do not vary significantly with $\tau$ and thus where $\gamma$ is not significantly different from zero.

Almost all (98.5%) of the single-station least-squares coefficients are significantly different from zero at $p < 0.05$, as are about 66% to 95% of the single-station quantile regression coefficients, with smaller fractions the farther the $\tau$ value is from the median (Table 5, column “Single-station coeff.v0_SE_frac2”). Similarly, about half of the single-station regression coefficients are significantly different from the MMQR regression coefficients, also at $p < 0.05$ (Table 5, column “Single-station coeff.vMMQR_SE_frac2”). The latter set of fractions is one measure of how often the single-station regressions give additional information compared to the regional MMQR results.

The MMQR scale coefficient $\hat{\gamma}$ was estimated to be negative and significantly different from zero (Table 5). Because the quantile regression coefficients for the MMQR model are given as $\hat{\beta}(\tau) = \hat{\beta}^{P} + \hat{\gamma}\hat{q}(\tau)$ (Equation (11)), it is the negative value of $\hat{\gamma}$ that determines that the quantile regression coefficients decrease with increasing $\tau$. Further, because $\hat{q}(\tau)$ is approximately symmetric around $\tau = 0.5$, so are the quantile regression coefficients.

As discussed in Section 2.2, the MMQR method estimates two fixed effects for each station: the location effect $a_{i}^{P}$ and the scale effect $\delta_{i}$. It is necessary to review these estimates to determine whether the scale effects are significant, and whether the effects vary by basin, and if so, how. Figure 6 shows that the scale effects are significantly different from zero (because all values exceed twice their standard errors), and further, both effects vary among the basins, because the differences between the values far exceed their standard errors.

To obtain some hydrologic understanding of how the scale and location effects vary and to demonstrate how they could be estimated at ungauged basins, plots of their values as a function of selected static basin characteristics taken from the GAGES-II dataset [90] are provided (Figure 7). The basins were split into two groups at the 98th meridian west (longitude −98°, which runs through eastern North and South Dakota; refer to Figure 2) to simplify the relationships. For both groups, the location effect varies strongly, positively, and approximately linearly with the log of drainage area, which is as expected, because the location or central tendency of flood peaks is expected to increase with drainage area. The scale effect, which is an indicator of the spread of the distribution of the log-transformed peaks, was found to vary most strongly with longitude east of the 98th meridian, decreasing when moving east; a wider peak-flow distribution is not unexpected when moving west in the study region because in doing so, the balance between evaporative demand (potential evapotranspiration, PET) and precipitation moves from precipitation being much greater than PET to a more balanced state [102,103]. West of the 98th meridian, the scale effect was found to vary most strongly with mean basin elevation and decreases with increasing elevation to about 2500 m. This decrease in spread with elevation indicates that higher basins, which have more snow and steeper slopes, have narrower (less variable) peak-flow distributions, which is also as expected.

The relations between the fixed effects and static basin characteristics (Figure 7) indicate that MMQR models could be developed for ungauged basins in the study region, although with some uncertainty, especially regarding the scale effect. To demonstrate the effect of the resulting uncertainty on peak-flow quantile predictions, a leave-one-out cross-validation (LOOCV) analysis was performed. In this analysis, each study basin in turn was assumed to be ungauged and left out of the model fitting. Using the remaining basins, new MMQR coefficients (Section 2.2) were computed using climate variable values with MWBM simulations using median rather than at-site calibrated parameters (Section 2.6), and likewise, new relationships between basin characteristics and fixed effects like those in Figure 7 were computed. The basin characteristic–fixed effect relations were used to predict the fixed effect for the left-out basin (Figure S3). With these fixed-effect estimates and the new MMQR coefficients, quantiles were predicted at the left-out basin. These predicted quantiles were evaluated using the same goodness-of-fit statistics as the primary results (Section 2.3).
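The fixed-effect part of the leave-one-out procedure can be sketched in Python. This is a deliberately reduced illustration: it predicts one fixed effect from a single basin characteristic with an ordinary least-squares line, and it omits the refitting of the MMQR coefficients that the full analysis performs at each step. The function names and the synthetic, noise-free data are hypothetical and do not represent study results.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def loocv_predictions(x, y):
    """Predict each basin's fixed effect from a fit that excludes that basin."""
    preds = []
    for i in range(len(x)):
        # Treat basin i as ungauged: drop it from the training set.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        preds.append(intercept + slope * x[i])
    return preds

# Synthetic example: a location effect that is exactly linear in the log of
# drainage area (made-up coefficients, for illustration only).
log_area = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
location_effect = [2.0 + 0.8 * v for v in log_area]
preds = loocv_predictions(log_area, location_effect)
errors = [p - y for p, y in zip(preds, location_effect)]
```

With real data the left-out errors would be nonzero, and their size is what drives the larger as-if-ungauged deviations reported in Section 3.4.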

3.4. Goodness of Fit and Comparison of Modeling Approaches

Two goodness-of-fit measures were proposed to compare the modeling approaches. One is based on the properties of the quantile regression deviations $\rho_{\tau}(y_{i} - X_{i}'\hat{\beta})$ (where i indexes individual peak flows) that are minimized in standard quantile regression. As expected, because they have much more flexibility in fitting, the single-station quantile regressions have smaller deviations than the MMQR regressions throughout, except for the upper extremes of the distribution of values for $\tau = 0.25$ and $\tau = 0.5$ (Table 6). The as-if-ungauged (MMQR-LOOCV) deviation statistic values are usually around twice the corresponding MMQR values, which mostly reflects the error in the estimated fixed effects (Figure S3), as the estimates of the MMQR regression coefficients $\beta(\tau)$ are insensitive to dropping a single gauge.

As discussed in Section 2.3, the quantile regression deviations for $\tau = 0.5$ turn out to be half the absolute errors. Thus, focusing on the Median rows in Table 6 for $\tau = 0.5$, the typical absolute errors for the single-station, MMQR, and MMQR-LOOCV results are about 0.28, 0.31, and 0.66, respectively. To make those values more meaningful, because the peaks are log-transformed in the models, exponentiating the errors gives the typical error ratios for $\tau = 0.5$, which are about 1.32, 1.36, and 1.93, respectively. These values indicate, for example, that for $\tau = 0.5$, for a gauged basin in the study using the MMQR method, the model typically predicts a peak-flow quantile about 1.36 times as big as the observed or the reciprocal, i.e., a peak-flow quantile about 1/1.36 = 0.73 as large as the observed.
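The conversion from typical absolute errors in log space to error ratios is a one-line calculation, shown here as a short Python check. The three input values are the ones quoted above; the dictionary and variable names are hypothetical, and the natural logarithm is assumed because it reproduces the quoted ratios.

```python
import math

# Typical absolute errors of log-transformed peaks at tau = 0.5, as quoted
# in the text (twice the median weighted deviations).
typical_abs_error = {"single-station": 0.28, "MMQR": 0.31, "MMQR-LOOCV": 0.66}

# Exponentiating a log-space error gives a multiplicative error ratio.
error_ratio = {k: math.exp(v) for k, v in typical_abs_error.items()}

# The same log-space error in the other direction gives the reciprocal ratio.
reciprocal_ratio = {k: math.exp(-v) for k, v in typical_abs_error.items()}
```

Rounded to two decimals, the ratios are 1.32, 1.36, and 1.93, and the reciprocal for the MMQR case is 0.73, matching the values in the text.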

The other proposed goodness-of-fit measure is the prevalence of quantile crossing (equivalent to the fraction of fitted non-monotonic dependent variable values for a given design point; see Section 2.3). As expected, because of their additional flexibility in fitting, the single-station quantile regression lines cross more often than those of the MMQR results (Table 7; refer to Figure S5 panel C for an example), but the MMQR and MMQR-LOOCV lines do occasionally cross. In the single-station results, the prevalence of crossing depends on the quantile, being about 2% at the extremes and decreasing to about 0.4% at the center, and at worst, at one or more observations, only 2 of 7 (~0.286) predicted values do not cross (counting the $\tau = 0.04$ case that cannot cross), but on average, about 99% of observations have no crossing. The MMQR and MMQR-LOOCV lines cross about 0.1% of the time independently of the quantile, but at worst, at one or more observations, only 1 of 11 (~0.0909) predicted values do not cross (counting the $\tau = 0.002$ case that cannot cross), although on average, about 99.9% of observations have no crossing. From reviewing the results in detail, for the few stations that have an especially large maximum sqrt_ann_max_Q value associated with one observation, the MMQR regression lines all converge at a point at a sqrt_ann_max_Q value that is smaller than the especially large sqrt_ann_max_Q value that happens to be present; as a result, for that largest sqrt_ann_max_Q value, monotonicity fails for all the quantiles. This type of situation explains at least some of the observations where only 1 of 11 predicted values do not cross (counting the $\tau = 0.002$ case).

As mentioned in Section 2.3, in the MMQR method, for “design points” $X_{it}'$ where the scale values $\hat{\delta}_{i} + X_{it}'\hat{\gamma}$ are positive, crossing is prevented by construction. Thus, the appearance of crossing for the method implies that some of the scale values are negative, and in fact, five of the scale values, of which there is one per observation and therefore 21,377 in total, are negative. Because in the MMQR method, theoretically, the scale cannot be negative, the occurrence of a few negative scale values and the occasional crossing of quantiles indicates occasional failure of the fit. This failure is like that of the crossing of the single-station regression lines, which is also theoretically impossible. Thus, overall, the monotonicity results favor the MMQR method.
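The link between a negative scale value and quantile crossing can be made concrete with a small Python sketch of the one-predictor MMQR fitted value, written as (location) + (scale) × q(τ). All parameter values and q(τ) values below are hypothetical and chosen for exact arithmetic; they are not the estimates in Table 5. With a negative γ, the scale reaches zero at x* = −δ/γ, where all quantile lines meet, and beyond that point their order reverses.

```python
def mmqr_fitted(x, q, a, delta, beta, gamma):
    """One-predictor MMQR fitted quantile: (a + beta*x) + (delta + gamma*x) * q."""
    return (a + beta * x) + (delta + gamma * x) * q

def convergence_point(delta, gamma):
    """Predictor value at which the scale delta + gamma*x equals zero."""
    return -delta / gamma

# Hypothetical parameters for one station (illustrative only).
a, delta, beta, gamma = 5.0, 0.5, 0.75, -0.0625
q_values = [-1.5, 0.0, 1.5]  # hypothetical q(tau) for three increasing tau values

x_star = convergence_point(delta, gamma)

def fitted_at(x):
    """Fitted values at predictor x for each q(tau), ordered by increasing tau."""
    return [mmqr_fitted(x, q, a, delta, beta, gamma) for q in q_values]

below = fitted_at(4.0)      # positive scale: quantiles increase with tau
at_star = fitted_at(x_star) # zero scale: all quantile lines meet
above = fitted_at(12.0)     # negative scale: quantile order is reversed
```

This mirrors the situation described above, in which one unusually large sqrt_ann_max_Q value lies beyond the point where a station's quantile lines converge.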

Of note, although their fits as measured by adjusted R² are not as good as when using sqrt_ann_max_Q, the other climate variables (ann_tot_Q, ann_tot_prcp, and sqrt_ann_max_prcp) when used with the MMQR method (and log-transformed peaks) have no negative scale values and thus no crossing of quantiles for that method. Similarly, when using square-root-transformed peaks and the same set of climate variables, all MMQR scale values are positive for all climate variables, including sqrt_ann_max_Q, and thus there is no crossing of quantiles for the MMQR method.

3.5. Results Summary

About 57% of the observed peak-flow trends are positive, whereas the fractions of positive trends in the considered climate variables range from about 72% to about 91%. This difference indicates the effects of additional factors driving trends in peak flows other than climate in the study basins, even though they have been filtered for the effects of urbanization and reservoir regulation. In the regression modeling, based on adjusted R² values, square-root-transformed annual maximum daily simulated discharge (ann_max_Q) was determined to be the best climate predictor; log-transformation of the peak flows was slightly better in the single-station regressions than square-root transformation, whereas the square root was better for panel regressions, but log-transformation was selected for the main results given its ease of interpretation. The associated quantile regression coefficients for the selected model were positive, as expected, and generally decreased with increasing $\tau$ (non-exceedance probability), with the differences in coefficients being marginally significant in single-station regression but highly significant for the regional (panel) model. This decrease in coefficients indicates decreasing sensitivity to climate with increasing $\tau$, but the result is specific to the selected model, and the sign of the variation of coefficients with $\tau$ is transformation dependent. The panel model location and scale effects were found to have strong relationships with morphometric basin characteristics, indicating a means of estimating them for ungauged basins, and based on these relationships, quantiles for the study basins were predicted using an LOOCV approach. Based on quantile regression-specific goodness-of-fit statistics, weighted quantile deviations and monotonicity fractions, single-station regressions were usually found to have smaller weighted quantile deviations and monotonicity fractions than the panel regression results, which occurs because they use more degrees of freedom in model fitting. Tradeoffs between single-station and regional (panel) modeling are presented in the Discussion and Conclusions section.

4. Applications

4.1. General Considerations

Several types of applications of the models presented here, the fitted single-station and panel regression models, can be envisioned. To begin, consider how to parameterize the models for different locations, of which there are three types: (1) locations of interest that are gauged and used in this study (or any such study); (2) locations that are gauged but not used in this study; and (3) ungauged locations. All these locations are assumed to be within the study area and not substantially affected by reservoir regulation or urbanization. For any location of interest, three types of information are needed: (1) MWBM model runs to provide the climate time series; (2) regression coefficients that are applied to the climate values; and (3) the single-station intercepts (one per quantile) or the MMQR fixed effects (location and scale). For locations of the first type, those that were used in the study, all the needed data are already available. For locations of the second type, those which are gauged but not used in this study, single-station regressions could be computed given climate data from MWBM model runs. Such model runs could be performed with little loss of accuracy relative to the at-station calibrated values using the median parameters (refer to the discussion of Table 5 in Section 3.3). The MWBM parameters could also be estimated using at-station calibration if the station had sufficient daily discharge data. For the MMQR results at such locations, in addition to MWBM model runs, the quantile regression coefficients are already available, but fixed effects are needed. Those could be computed at gauged basins by applying steps 2 and 4 of the MMQR estimation procedure presented in Section 2.2 without recomputing the coefficients (steps 1 and 3). For locations of the third type, those that are ungauged, climate data could be simulated using the MWBM model with median parameters, but in this case, single-station regression is not possible. 
For the MMQR model, the coefficients are available, but for some applications, the fixed effects are needed. As described in Section 3.3, the fixed effects were found to be dependent on certain static basin characteristics (Figure 7), and thus they could be estimated (and were, in an LOOCV analysis) from those characteristics, although with some uncertainty (e.g., Figure S3).
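As a rough illustration of how fixed effects might be predicted as-if-ungauged, the following minimal sketch runs a leave-one-out cross-validation with a one-predictor least-squares line. The function names, the single predictor (a hypothetical basin characteristic such as log drainage area), and the synthetic numbers are assumptions made for illustration only; the regressions on basin characteristics actually used in the study are those described in Section 3.3 and may differ.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loocv_predictions(xs, ys):
    """Predict each basin's fixed effect from a line fitted to all other basins."""
    preds = []
    for k in range(len(xs)):
        x_train = xs[:k] + xs[k + 1:]
        y_train = ys[:k] + ys[k + 1:]
        intercept, slope = fit_line(x_train, y_train)
        preds.append(intercept + slope * xs[k])
    return preds


# Synthetic, hypothetical values: a basin characteristic and a location effect
basin_characteristic = [1.0, 2.0, 3.0, 4.0]
location_effect = [3.0, 5.0, 7.0, 9.0]
loocv_location_effect = loocv_predictions(basin_characteristic, location_effect)
```

Comparing `loocv_location_effect` with the full-model values gives a sense of the additional error incurred when a basin is treated as ungauged.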

One simple application that could be performed at any of the three types of locations is a sensitivity analysis (a kind of “change factor” analysis [33,34]) to estimate the effect of future changes in climate variable values, which requires only the coefficients. It is important to note that this application assumes that the coefficients, which reflect the relationship between peak-flow and climate variables, remain unchanged. For a gauged location, either the single-station or panel regression coefficients could be used, if the desired τ value is available for the single-station regressions, but for simplicity, we assume the location is ungauged and thus the panel regression coefficients will be used. The MMQR regression coefficients given in Table 6 indicate the fitted response of log-transformed peak-flow quantiles to changes in the square-root-transformed simulated annual maximum discharge (ann_max_Q) in mm for different quantiles. If the location of interest is ungauged or otherwise not used in this study, making an educated guess as to reasonable changes in this climate variable would require running MWBM for the historical period to determine a historical range of values of this variable. These values fluctuate from year to year, but a fitted trend can guide the choice: if, for example, ann_max_Q has risen from 4 mm to 6 mm over the last 100 years, it could potentially rise another 2 mm during the project’s design life. Given the present design quantile estimate, perhaps from a spatial regression equation, the relative increase in the quantile estimate given such an increase in annual maximum discharge would be computed as:

$\frac{Q_{i,\mathrm{future}}(\tau)}{Q_{i,\mathrm{present}}(\tau)} = e^{\beta_{X}(\tau)\,[X_{i,\mathrm{future}} - X_{i,\mathrm{present}}]},$

(14)

for some basin i, where $Q_{i,\mathrm{present}}(\tau)$ and $Q_{i,\mathrm{future}}(\tau)$ are the present and future peak-flow quantiles, respectively, with $\mathrm{AEP} = 1 - \tau$, $\beta_{X}(\tau)$ is the MMQR regression coefficient for climate variable X, assumed here to be $\mathrm{ann\_max\_Q}^{1/2}$, at $\mathrm{AEP} = 1 - \tau$ (Table 6), and $X_{i,\mathrm{present}}$ and $X_{i,\mathrm{future}}$ are the present and future values, respectively, of the climate variable. Equation (14) follows from Equations (10) and (11) when the peak flows used in fitting the coefficients $\beta_{X}(\tau)$ are log-transformed, and it implicitly assumes that future conditions do not induce a change in the relationship between the climate variable and the peaks as represented by the coefficients. Even simpler, although less accurate, as indicated by the uncertainty of the MMQR coefficient and because it does not explicitly account for the effects of temperature changes, would be to use annual total precipitation (ann_tot_prcp) as the climate variable (Table S2). Using this climate variable would be simpler because it is likely that historical values could be obtained from external sources and no MWBM runs would be required.
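The change-factor calculation of Equation (14) can be sketched numerically as follows. This is a minimal illustration, not part of the study's software: the function name is hypothetical, and the coefficient value is a placeholder rather than a value from Table 6. The climate variable X is taken to be the square root of ann_max_Q in mm, and the exponential assumes natural-log-transformed peaks, as in Equation (14).

```python
import math


def change_factor(beta_tau, ann_max_q_present_mm, ann_max_q_future_mm):
    """Ratio of future to present peak-flow quantile, following Equation (14).

    beta_tau is the quantile regression coefficient at the chosen tau;
    the climate variable X is sqrt(ann_max_Q), with ann_max_Q in mm.
    """
    x_present = math.sqrt(ann_max_q_present_mm)
    x_future = math.sqrt(ann_max_q_future_mm)
    return math.exp(beta_tau * (x_future - x_present))


# Placeholder coefficient for illustration only; not taken from Table 6
BETA_EXAMPLE = 0.5

# ann_max_Q assumed to rise from 6 mm (present) to 8 mm (future)
ratio = change_factor(BETA_EXAMPLE, 6.0, 8.0)
print(f"future/present quantile ratio: {ratio:.3f}")
```

Multiplying a present-day design quantile by `ratio` gives the corresponding future estimate under the stated assumption that the coefficient does not change.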

4.2. Estimation via Adjustment

4.2.1. Method and Study-Wide Results

For a detailed application, it is convenient to select an application that uses only the historical data used to fit the models because it avoids the use of an additional dataset of the projected climate. In the context of historically nonstationary climate and without information on the future climate, the quantities that may be the most useful for design may be the peak-flow quantiles at the most recent year for which data are available (in this study, 2018). The use of such end-of-period estimates was previously considered by Glas et al. [39], Luke et al. [42], and Over et al. [26], among others. The fitted regression models predict conditional quantiles given a climate variable value. Therefore, a set of conditional quantiles is already available for 2018, but those quantiles are conditional on a certain realization of the climate variable. What is needed instead is an estimate of the unconditional quantiles for 2018, i.e., quantiles corresponding to the estimated distribution of climate values in 2018, rather than the observed 2018 climate value (or any single climate value). Such unconditional quantiles can be estimated by adjusting the observed peak flows to 2018 long-term climate conditions using their estimated exceedance probabilities, the associated quantile regression coefficients, and the change in climate values from their year of occurrence to 2018. This adjustment process is a generalized version of the approach used by Over et al. [26] to adjust peaks for the quantile-dependent effects of urbanization, and a proof of its validity is given in Appendix A. Because this process involves adjusting observed peaks, it cannot be used for ungauged basins, but a similar process using peak-flow quantiles conditioned on adjusted climate values can be used to estimate the end-of-period peak-flow quantiles for an ungauged basin; a description of that application is beyond the scope of this paper.

The stations $i = 1, 2, \dots, N$ were considered one at a time, and the peaks were adjusted as follows:

1. Fit a smooth trendline to the historical climate values, yielding the time series $\bar{X}_{i,t},\ t = 1, 2, \dots, T$, where t indexes the year. Here, loess [104], implemented using the loess function from the R stats package [73] with a span value set to 1.0, was used for smoothing.
2. Because the climate coefficients depend on $\tau$, the next step is to estimate the $\tau$ value associated with each observed peak. This was achieved by interpolating among the conditional quantiles (Equation (5) for the single-station regressions and Equation (10) for the MMQR model); this interpolation was performed linearly in the space of log-transformed peaks versus the unit Gaussian quantile function with $\tau$ as the argument (equivalent to a set of straight-line segments on lognormal probability paper).
3. Given the estimated $\tau$ value $\hat{\tau}$ for each peak, estimate the coefficient value $\hat{b}(\hat{\tau})$ by interpolation among the fitted quantile coefficients and their $\tau$ values, which are given by $\hat{\beta}_{i}^{SS}(\tau)$ (Equations (5) and (7)) for the single-station regressions and $\hat{\beta}(\tau)$ for the MMQR model (Equation (11)). This interpolation was performed linearly. Both interpolations are limited by the range of $\tau$ values, which are from $\tau = 0.002$ to $\tau = 0.998$ for the MMQR regressions, which is sufficient for all but the most extreme peaks, and from $\tau = 0.04$ to $\tau = 0.96$ for the single-station regressions. When the estimated $\tau$ value was beyond the most extreme value for which a coefficient estimate was available, the value at the nearest extreme was used.
4. Given the coefficient value $\hat{b}(\hat{\tau})$, the adjusted peak was computed as:

$\log Q_{it,T} = \log Q_{it} + \hat{b}_{it}(\hat{\tau})\,[\bar{X}_{i,T} - \bar{X}_{i,t}]$

(15)

where $\log Q_{it}$ is the log of the observed peak-flow value for year t and station i, $\log Q_{it,T}$ is $\log Q_{it}$ adjusted to climate conditions during year T, $\hat{b}_{it}(\hat{\tau})$ is the estimated regression coefficient for the ith station for year t at the estimated non-exceedance probability value $\hat{\tau}$, $\bar{X}_{i,t}$ is the smoothed climate for year t at station i, and $\bar{X}_{i,T}$ is the smoothed climate value for year T at station i. For the applications considered here, the year to which the peaks were adjusted was taken as 2018, the end of the study period, so $T = 2018$.
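The interpolation and adjustment steps of the procedure above can be sketched for a single peak as follows. This is a minimal sketch under stated assumptions, not the study's implementation (which used R): all function and variable names are hypothetical, the smoothed climate values from the first step are assumed to be already available, the conditional quantiles are assumed not to cross, and the numerical inputs are synthetic. Clamping the τ estimate to the fitted range is a simplification that yields the same coefficient as using the value at the nearest extreme.

```python
from bisect import bisect_right
from statistics import NormalDist

STD_NORMAL = NormalDist()  # unit Gaussian, gives the probability-paper axis


def interp_clamped(x, xs, ys):
    """Linear interpolation of ys over ascending xs; nearest end value outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x)
    w = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + w * (ys[j] - ys[j - 1])


def estimate_tau(log_peak, taus, cond_log_quantiles):
    """Estimate tau of a peak, interpolating linearly in log Q versus Gaussian quantile."""
    zs = [STD_NORMAL.inv_cdf(t) for t in taus]
    z_hat = interp_clamped(log_peak, cond_log_quantiles, zs)
    return STD_NORMAL.cdf(z_hat)


def adjust_log_peak(log_peak, taus, cond_log_quantiles, betas, x_smooth_t, x_smooth_T):
    """Adjust one log peak from year t to year T climate conditions (Equation (15))."""
    tau_hat = estimate_tau(log_peak, taus, cond_log_quantiles)
    b_hat = interp_clamped(tau_hat, taus, betas)  # coefficient interpolated linearly in tau
    return log_peak + b_hat * (x_smooth_T - x_smooth_t)


# Synthetic inputs for one station-year (hypothetical values)
taus = [0.1, 0.5, 0.9]
cond_log_quantiles = [5.0, 6.0, 7.0]  # conditional quantiles of log Q for that year
betas = [0.6, 0.5, 0.4]               # coefficients decreasing with tau
adjusted = adjust_log_peak(6.0, taus, cond_log_quantiles, betas,
                           x_smooth_t=2.0, x_smooth_T=2.4)
```

In this synthetic case the peak sits on the median line, so the coefficient at τ = 0.5 is applied to the change in smoothed climate between year t and year T.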

Because, in the study dataset, the selected regression climate variable ann_max_Q is usually increasing over time (as are all the climate variables considered; refer to Figure 3 and Table 3), it is expected that the usual effect of the adjustment of peaks to 2018 climate conditions will be to increase them. This effect can be seen by comparing the adjusted and observed moments and the adjusted and observed quantiles of the log-transformed peaks (Figure 8 and Figure 9). The moment plots (Figure 8) indicate that a great majority (about 86%) of the means increase, which agrees with the expectation that peaks will be increased by the adjustment process due to the increasing ann_max_Q values, whereas a similarly large majority (about 82%) of standard deviations decrease, and that the change in skew is more balanced (about a 46% increase). The preponderance of reduced standard deviations is also expected as the adjustment process tends to flatten out trends, which in turn reduces the standard deviation. These changes in moments correspond to increases in quantiles (Figure 9); because most of the standard deviations decrease as the means generally increase, the magnitude of the increases in the quantiles measured by the median adjusted-to-observed ratios decreases somewhat as the AEP values decrease. Note that the quantiles portrayed here were obtained nonparametrically by interpolating between plotting positions; fitting the adjusted peaks to a selected distribution such as log-Pearson type 3 is also possible (as was performed by Over et al. [26]) and may be preferred, for example, to satisfy standards or recommendations such as Bulletin 17C [31], or it may be necessary, because estimates for extreme quantiles outside the range of the plotting positions are desired.

An analogous adjustment process can be applied using the single-station quantile regression results. The adjusted single-station moments are nearly the same on average as those from the MMQR-based adjustment (Figure S4).

Another way to characterize the effect of the adjustment process is to consider its effect on the peak-flow trends. Again, because the selected climate variable is usually increasing for most basins, the earlier peaks will usually be adjusted upwards more than the later peaks, which has a negative effect on the peak-flow trends (Figure 10A). One might also expect the adjustment process to reduce the absolute magnitude of the peak-flow trends, but this is not what occurs (Figure 10C, two leftmost boxplots). As shown in Figure 10B, most basins are left of the dashed line, indicating where the adjustment leads to a zero trend as measured by Kendall’s tau. This over-adjustment occurs because the trends in the best-fit climate variable, ann_max_Q, are usually larger than those in the peak flows (Figure 3 and Figure 10C), as well as being more statistically significant than the peak-flow trends (Figure 3). This lack of agreement in trend magnitude and significance indicates that for many basins, some factor other than climate (i.e., basin changes with hydrological effects) is reducing the peak-flow trends or that the MWBM model is not capturing the effects of climate changes accurately, even though it works better than the other tested climate variables to explain the inter-annual fluctuations in peak flows. Regarding watershed changes as a possible cause of reductions in the peak-flow trends, recall that the effects of reservoir regulation and urbanization have been filtered out from the collection of study basins, at least approximately, but other factors, such as changes in agricultural practices, have not been controlled for.
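The over-adjustment mechanism can be illustrated with a small synthetic example. The sketch below is an assumption-laden illustration, not the study's trend analysis: it uses a simplified Kendall's tau of a series against time (no tie correction), a single hypothetical coefficient for all peaks, and made-up linear series in which the climate trend, scaled by the coefficient, exceeds the peak-flow trend.

```python
def kendall_tau(values):
    """Kendall's tau of a series against time (simplified, no tie correction)."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = values[j] - values[i]
            s += (d > 0) - (d < 0)
    return s / (n * (n - 1) / 2)


years = range(10)
log_peaks = [6.0 + 0.01 * t for t in years]  # weak upward peak-flow trend (synthetic)
x_smooth = [2.0 + 0.05 * t for t in years]   # stronger upward climate trend (synthetic)
b = 0.5                                      # hypothetical coefficient, same for all peaks

# Adjust every peak to the final year's smoothed climate, as in Equation (15)
adjusted = [q + b * (x_smooth[-1] - x) for q, x in zip(log_peaks, x_smooth)]

tau_observed = kendall_tau(log_peaks)
tau_adjusted = kendall_tau(adjusted)
```

Here the observed series has a positive trend, but the adjusted series has a negative one because earlier peaks are raised by more than the observed peak-flow trend accounts for.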

The adjustment process is illustrated for two example basins next. One, presented in Section 4.2.2, has positive trends in both peak flows and climate; the other, presented in Section 4.2.3, has an overall negative peak-flow trend paired with a positive climate trend, so it is anomalous in that sense, but it is in a region where these opposing trends have been recognized for some time and are believed to have resulted from changes in agricultural practices.

4.2.2. Example Basin: Vermilion River near Danville, Illinois

From 1921 to 2018, the basin-integrated peak-flow climate signal for the Vermilion River near Danville, Illinois (Figure 2), as measured by the selected predictor, sqrt_ann_max_Q, was generally increasing, as were the peak flows; even the shapes of the fitted curves are similar (Figure 11A,D). Based on the MMQR predictions, the $\hat{\tau}$ values for the peaks each year were estimated (Figure 11B,C), and using those, the interpolated coefficients $\hat{b}(\hat{\tau})$ were computed by interpolating between the estimated MMQR coefficients (Table 6). Given the interpolated coefficients, the adjusted peak-flow values were computed using Equation (15) (Figure 11D), all of which are increased relative to the observed values (Figure 11E,F) because of the generally increasing climate signal (Figure 11A). Because the peak-flow and climate trends are well-matched, the trend in the adjusted peaks is approximately flat (Figure 11D).

4.2.3. Example Basin: Sugar River near Brodhead, Wisconsin

Rivers in the Driftless Area of southwest Wisconsin and northwest Illinois have experienced decreasing peak flows and increasing total flows, low flows, and precipitation [105,106,107,108,109], although the peak trends have reversed more recently [110]. The decreasing flow trends in this region have usually been attributed to decreases in the intensity of agriculture and improved agricultural practices leading to decreased erosion and increased infiltration, although Park and Markus [108] hypothesized that the cause could be partly climatic. The adjustment process for an example basin in this region, U.S. Geological Survey (USGS) streamgage 05436500, Sugar River Near Brodhead, Wisconsin (Figure 2), is illustrated in Figure 12. According to the smoothed trend line, the selected climate variable, sqrt_ann_max_Q, has been increasing throughout the study period (1921–2018), although at a greater rate beginning in about 1985 (Figure 12A). The peaks decreased until about 1990, a few years after the time when the climate variable began increasing more quickly (Figure 12D). Figure 12B,C give time series and scatterplot views of observed peaks relative to the fitted quantile lines from the MMQR analysis, from which the non-exceedance probability τ of each peak is estimated. Based on the estimated probabilities, the associated quantile regression slopes, and the difference between each year’s smoothed climate and that of 2018, the peaks are adjusted to 2018 climate conditions (Equation (15)). Because of the positive climate trend (Figure 12A), the adjusted peaks are all at least as large as the observed peaks (Figure 12D–F), especially in the earlier years of the record (Figure 12D), and because of this, the overall negative trend in the peaks is steeper in the adjusted peaks than in the observed peaks. 
Although the effect of adjustment on the peak-flow trend may seem anomalous, the adjustment process developed here can only account for the effects of climate, and, based on the literature describing watershed changes in this region, this result is most likely an indication that changes in peak flows in this basin are not the result of changes in climate alone but also likely result from changes in agricultural practices. The adjusted peaks thus reflect an estimate of the peak that would have occurred if the wetter climate (higher ann_max_Q values) from 2018 had occurred together with the historical land-use practices. Therefore, although the process of adjusting the peaks for changes in climate appears to be functioning properly, the adjusted peaks are not providing accurate flood-frequency estimates for 2018 because they do not reflect the substantial watershed changes that have evidently occurred. The regression approaches used in this study can be generalized to include additional variables characterizing land use [27], but it is not clear that suitable variables with long records exist for the Driftless Area.

For illustration, the adjustment of the peaks for this basin based on its single-station quantile regression is also provided (Figure S5). Despite differences in the quantile regression method, which are evident in panel C, the adjustment is quite similar. It is important to note that if predictions at ungauged basins are not needed, this similar result for the single-station adjustment indicates the possible efficacy of adding time as a variable to the regression, as was performed by Glas et al. [39], to model the effect of an unknown trend driver. The analysis of the use of regression on time to address the unknown trend driver problem for single-station analyses is, however, beyond the scope of the present analysis.

5. Discussion and Conclusions

In the context of hydraulic design, that is, regarding the selection of a design discharge with reference to the results of a flood-frequency analysis based on the assumption of stationarity, the effects of climate variability and change on peak flows are a complex but important issue and one of growing concern. This study demonstrates and compares two nonparametric (distribution-agnostic) regression-based methodologies for estimating that effect and applying the results to the problem of estimating the flood-frequency distribution of a selected year in the historical record in the context of a nonstationary flood-frequency study in the midwestern United States.

A method for estimating the effect of climate variation on peak flows requires, first, a way to measure climate. In this study, four measures of climate were considered; two were provided by the discharge simulated using a spatially lumped daily timestep water-balance model, and two by the precipitation used to drive the water-balance model. For both the discharge and precipitation variables, annual maximum daily and annual total values were considered. It was found, according to an adjusted R² criterion, that annual maximum daily simulated discharge, ann_max_Q, was the best predictor of peaks among the measures of climate considered. This result shows the potential of simple hydrologic models applied at a short timestep to be useful in characterizing climate for use in regression models of peak flows.

Despite the good correlations between the peaks and the annual maximum simulated discharge, as evidenced by the adjusted R² values, an anomalous result was observed in the comparison of the trends in the peak flows and those in the climate series considered. Although the Mann–Kendall trends in the peaks are positive for a majority of the study basins, this majority is not large (57%), and relatively few (28%) of these trends are significant (p < 0.05). In contrast, a large majority (86–91%) of the climate trends (except those for the annual maximum daily precipitation) are positive, and many are significantly so. This disagreement in the direction of the peak-flow and climate trends has important implications for the application of these results, which are discussed further below.

The two approaches to regression that were tested and compared were a single-station approach, where a regression model was fitted to each basin separately with quantile-dependent intercepts and slopes, and a regional approach based on a fixed-effect panel-model regression framework, in which the climate coefficients are common across all the basins with their distributional differences among basins accounted for by two sets of fixed effects—one set a single location effect (like an intercept) that models differences in the mean and the other a quantile-dependent scale effect that models differences in the scale or variance. In both approaches, the primary results were obtained by using a form of quantile regression, which allows the determination of different coefficients for different quantiles, and both assume a linear functional form between the independent and dependent variables (the climate and the peak flows, respectively). For the single-station regressions, and for introductory trend analyses using Kendall’s tau, it was further assumed that the peak-flow and climate time series are without serial correlation.

Different power transformations of the peaks and climate variables in the regressions were tested, and although the models fitted using square-root-transformed peaks with square-root-transformed ann_max_Q performed somewhat better, according to an adjusted R² criterion, than those fitted using log-transformed peaks and square-root-transformed ann_max_Q, particularly for the panel regressions, this paper focuses on the results from the latter because of the greater ease in interpreting the results. For all pairs of transformations that were tested, the quantile regression coefficients varied among quantiles monotonically with exceedance probability, and for the panel model analysis, these variations with exceedance probability were statistically significant. For the single-station regressions, the variations may not generally be statistically significant, but the agreement with the panel model results indicates they are real. However, the direction of the variation in quantile regression coefficients depends on the selected transformations, which indicates it may be possible to select a pair of transformations for which there is no such variation, which would simplify subsequent use. Nevertheless, performing the quantile regression analysis would be required to demonstrate this lack of variation.

The choice between single-station and panel regressions has many dimensions. The central tendencies of single-station regression coefficients approximately agree with panel model coefficients, but there is a wide range of single-station coefficient values around their central tendency, and the uncertainties (standard errors) of the single-station coefficients, although usually small enough that the coefficients are significantly different from zero (p < 0.05), are much larger than the panel model coefficient standard errors (Section 3.3). Further, the use of the panel model allows the estimation of coefficients across a much wider range of exceedance probability values, and the panel model coefficients are ostensibly applicable to any basin throughout the study region that is hydrologically similar to those used in the study, therefore opening up the possibility of application to ungauged basins. Hydrological similarity has not been defined as part of this study, but to start with, it means at least that filtering for reservoir regulation and urbanization, as was performed in this study, should be applied to any candidate basin. Beyond those filters, there may be wide applicability, apart from the problem of the disagreement of peak-flow and climate trends, considering the wide range of basin properties among the study basins (Table 1). Although the MWBM model is simple, the use of discharge simulated by such a model has the potential to remove much of the variation between basins in the hydrologic response to changes in climate because differences among basins in hydrologic processes related to the generation of annual maxima, such as the role of snow and patterns of antecedent moisture conditions, are at least approximately accounted for by the model. Nevertheless, it was estimated that about half the single-station regression coefficients are significantly different (p < 0.05) from the MMQR values, which is evidence that the panel regression coefficients do not always capture all the information available regarding responses to climate fluctuations from the study basins.

Another perspective on the tradeoff between single-station and regional fitting is provided by comparing the two quantile-regression-focused goodness-of-fit measures that were applied. The sum of weighted quantile deviations across all stations was smaller for the single-station quantile regressions than for the panel quantile regressions (Table 6), presumably because the single-station regressions are less constrained than the panel regressions. The flipside of this flexibility in the fitting of the single-station regressions is that they are much more likely than the panel regressions to have quantile regression lines that cross (Table 7). Notably, it is also possible to constrain single-station quantile regressions to prevent crossing of the quantile regression lines [81,82], and one could compare the results of such a method to the results of the methods considered here. It is likely the results would lie between the results of the present methods for both statistics as more constraints are being imposed compared to the unconstrained single-station regression, but fewer than for the MMQR method. Nevertheless, the present comparison using unconstrained single-station quantile regressions seems sufficient for this initial analysis.
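The two kinds of goodness-of-fit measure compared above can be sketched in simplified form. The exact definitions used in the study are given in an earlier section and are not reproduced here, so the following rests on assumptions: the weighted quantile deviation is taken to be the standard quantile-regression check (pinball) function, and monotonicity is taken to mean that predicted quantiles do not decrease as τ increases. Function names are hypothetical.

```python
def check_loss(y, q, tau):
    """Check (pinball) function: deviation of y from quantile q, weighted by tau."""
    u = y - q
    return u * (tau - (1.0 if u < 0 else 0.0))


def is_monotone(quantiles_by_tau):
    """True if quantiles ordered by increasing tau never decrease (no crossing)."""
    return all(b >= a for a, b in zip(quantiles_by_tau, quantiles_by_tau[1:]))


# Under-prediction at a high quantile is penalized more than over-prediction
under = check_loss(10.0, 8.0, 0.9)  # observation above the fitted quantile
over = check_loss(8.0, 10.0, 0.9)   # observation below the fitted quantile

# Fraction of synthetic station-years whose fitted quantile lines do not cross
fitted = [[1.0, 2.0, 3.0], [1.0, 3.0, 2.0], [2.0, 2.0, 4.0], [0.5, 1.5, 2.5]]
monotone_fraction = sum(is_monotone(f) for f in fitted) / len(fitted)
```

A less constrained fit tends to lower the summed check loss while making crossings like the second synthetic case more likely, which is the tradeoff described above.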

Applications to three types of basins were discussed: gauged study basins, gauged basins in the study region not used in the study, and ungauged basins in the study region. Both single-station and panel regressions could be used with gauged basins, but only the panel regressions can be used to make predictions at ungauged basins because the panel regression returns a single set of coefficients that are regionally applicable. Certain uses of the panel regression approach in ungauged basins require estimates of the fixed effects in addition to the regional coefficients. It was shown that the estimation of the fixed effects using static basin characteristics is possible, and the effect of estimating these effects on the uncertainty of quantile predictions was assessed using an LOOCV analysis. However, the search for basin characteristics and subregions in which the relationships are well-defined was not performed exhaustively, and more accurate estimates are likely possible. Indeed, it is likely that the fit of the MMQR model could also be improved by searching for more homogeneous subregions. To apply the results to ungauged basins with confidence, the resolution of the problem of determining for which basins climate is the only driver of change in peak flows is vital. Further, it is important to emphasize that applications to future climate conditions may be affected by changes in the relationships between peak-flow and climate variables resulting from changes in climate conditions.

Estimation of the flood-frequency distribution for a selected year of the climate record (in this paper, the last, 2018, was used) for gauged basins via adjustment of the observed peaks based on a smoothed version of the climate trend and the quantile regression coefficients was considered in detail. Because the climate trends are mostly positive, this adjustment results in increases in the peak flows for most basins. Distributionally, it was seen that the means of the log-transformed peaks increased for most basins, whereas the standard deviations decreased, and changes in skew were approximately balanced. As a result of the increasing mean coupled with decreasing standard deviation, the median ratio of adjusted to observed peak-flow quantiles decreased with decreasing AEP, although the quantiles considered (AEPs of 0.5, 0.1, 0.04, and 0.01) all increased for most basins.

From the perspective of the effects on the peak-flow trends, because the simulated climate trends are, outside of the central and southeast parts of the study area, usually larger than the observed peak-flow trends (although both are mostly positive, Figure 2 and Figure 3), adjustment of peaks in this way tends to over-adjust the peaks in the sense that the adjusted peaks have predominantly negative trends (Figure 10). This over-adjustment could arise from one or both of two sources: climatically driven change being over-simulated by the water-balance model, or changes in basin properties (land use/land cover change, changes in agricultural practices, or other anthropogenic effects such as construction of storage or diversions) that were not eliminated by the filtering of the study basins for reservoir regulation and urbanization. Because the water-balance model includes the basic processes of runoff generation, the effect of increases in precipitation should be offset, at least in part, by the effects of increases in temperature on evaporation and soil moisture, so a systematic bias of this type might indicate that the offsetting effects of temperature increases are biased low. Further examination of the model processes for basins showing larger differences between peak-flow and climate trends may also reveal some model process-related cause, perhaps related to differences in and changes to seasonality. An additional way to test the model would be to see if a different model gave the same result. It is important to note that the relative lack of (positive) trends in peak flows despite positive trends in precipitation extremes is a common observation [17], so a resolution of this anomaly in terms of hydrological modeling, if that is a source, may not be straightforward.
Regarding watershed changes, some widespread change or set of changes in basin properties may be reducing peak-flow trends, even though reservoir regulation and urbanization have been at least approximately filtered out, but if so, its nature is not currently known in general. As with the model processes, examination of basin properties for basins showing larger differences between peak-flow and climate trends may suggest some hypotheses. For example, one cluster of basins having peak flow-climate (as measured by ann_max_Q) trend anomalies is in southwest Wisconsin, in and near the Driftless Area, which, as discussed, is known to have experienced, until recently, decreasing high flows alongside increasing low flows and precipitation, and this outcome has usually been hypothesized to be due to changes in agricultural practices (Section 4.2.3). This example indicates that explaining why climate trends are often more positive than peak-flow trends across the study region may require substantial local knowledge. In addition, to model the change in basin properties jointly with the changes in climate, a quantitative variable describing those properties is essential. Nevertheless, in the context of analysis of a single gauged station, regression on time may be used as a workaround for an unknown trend driver [39], although that approach was not investigated in this study as it is not applicable to ungauged basins.

Despite the sometimes-contrasting behavior of the peak-flow and climate trends, which would challenge any methodology attempting to explain trends in peak flows due to changes in climate, and the open questions raised by that behavior, this study makes several contributions. It shows the potential for regional regression equations to be extended from space to space-time, which allows predictions of change to be made at ungauged basins without invoking the assumptions involved in trading space for time. It further shows that by using quantile regression, the possibly varying dependence of peak flows on climate can be assessed, and the regional-panel approach allows this dependence to be estimated for a wide range of exceedance probabilities. Finally, it demonstrates the potential for simple water-balance models, simulating at a daily time scale, to integrate the effects of precipitation and temperature change and thereby provide discharge predictions that are good predictors of peaks in the sense of regression fitting. It was not among the objectives of this study to resolve why the peak flows do not generally increase along with the climate signal, nor to determine the effects of various kinds of land-use changes throughout the study region or investigate other possible causes for the lack of full agreement between peak-flow and climate trends. These important questions, for which answers are needed to be able to apply the proposed methodology generally, must be left for future work. Given those answers, the methodology itself could be applied in a manner that is very similar to what is presented here.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/hydrology12050119/s1, Figure S1: Scatterplots of selected trend Kendall’s tau (Ktau) estimates; Figure S2: Coefficients and coefficient standard errors of selected alternative models; Figure S3: Scatterplots comparing method-of-moments quantile regression (MMQR) fixed effects computed using all the study basins (“full-model”) or predicted using leave-one-out cross-validation (LOOCV): (A) location effects and (B) scale effects; Figure S4: Scatterplots comparing adjusted moments of log-transformed peak flows (in ft³/s) using single-station quantile regression (QR) and method-of-moments quantile regression (MMQR) methods; Figure S5: Adjustment of peak flows to 2018 climate conditions at U.S. Geological Survey streamgage 05436500, Sugar River Near Brodhead, Wisconsin, using single-station quantile regression (QR) analysis; Table S1: Adjusted coefficient of determination (R²) values of the least-squares single-station and location-model panel regressions with selected climate variables, peak-flow and climate variable transformations, and approaches to water-balance model (MWBM) calibration; and Table S2: Regression coefficients and related statistics of selected alternative models.

Author Contributions

T.O.: conceptualization, methodology, formal analysis, writing—original draft, software. M.M.: data curation, formal analysis, software, writing—review and editing. H.P.: software. All authors have read and agreed to the published version of the manuscript.

Funding

This work was performed by the U.S. Geological Survey as part of the U.S. Federal Highway Administration Transportation Pooled Fund study TPF-5(460), with funding from the Illinois Department of Transportation (DOT), Iowa DOT, Michigan DOT, Minnesota DOT, Missouri DOT, Montana Department of Natural Resources and Conservation, North Dakota Department of Water Resources, South Dakota DOT, and Wisconsin DOT. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

Data Availability Statement

The raw data used in this study are publicly available. The peak-flow data are from the U.S. Geological Survey, National Water Information System database (https://doi.org/10.5066/F7P55KJN). The climate data used [94] were downloaded from https://cirrus.ucsd.edu/~pierce/nonsplit_precip/, accessed on 28 November 2023. Selected data products developed as part of this study, including the station list, the inputs to the regression analyses, and selected single-station regression results, have been compiled and published as a U.S. Geological Survey data release [79]. This data release also includes a model archive, which is a working archive of the scripts and data used to produce the main results of the paper, including the MWBM discharge modeling and the two approaches to regression modeling.

Acknowledgments

The authors are grateful to João Santos Silva, School of Economics, University of Surrey, for discussions of his publication on the quantiles via moments approach to estimating quantiles using panel regression; to Gregory McCabe, U.S. Geological Survey (USGS), for discussions of using the MWBM model at a daily time step; to David Pierce, Scripps Institution of Oceanography, for help in accessing the climate dataset used in this analysis; to Sara Levin, USGS, for discussions of nonstationary flood-frequency updating methods; to Lindsey Schafer, USGS, for help in preparing the figures; and to Jaqueline Ortiz, USGS, for help formatting the manuscript. We would also like to thank Chandramauli Awasthi, North Carolina State University, and Robin Glas, USGS, for many helpful comments and suggestions in their reviews of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Consider a climate process $C_t$ given by the sum of a time-varying mean $\bar{C}_t$ and an identically distributed mean-zero noise term $\epsilon_{c,t}$:

$$C_t = \bar{C}_t + \epsilon_{c,t}. \tag{A1}$$

Then, shift the climate values each year $t$ by the change in the mean between the year and some selected year $T$ by adding $\Delta\bar{C}_{T,t} = \bar{C}_T - \bar{C}_t$. This shift generates a new process $C'_{T,t}$, defined as:

$$C'_{T,t} = C_t + \Delta\bar{C}_{T,t} = \bar{C}_t + \epsilon_{c,t} + \bar{C}_T - \bar{C}_t = \bar{C}_T + \epsilon_{c,t}, \tag{A2}$$

which, being the sum of $\bar{C}_T$ and $\epsilon_{c,t}$, is stationary.

Next, define the peak-flow process $Y_t$ as a linear function of the climate $C_t$:

$$Y_t = a + b\,C_t + \varepsilon_{Y,t}, \tag{A3}$$

(compare Equations (1) and (9)). Because $C_t = \bar{C}_t + \epsilon_{c,t}$ (Equation (A1)), $Y_t$ can be rewritten as:

$$Y_t = a + b\,[\bar{C}_t + \epsilon_{c,t}] + \varepsilon_{Y,t}. \tag{A4}$$

Then shift $Y_t$ to year $T$ by adding the quantity $b\,[\bar{C}_T - \bar{C}_t]$ as in Equation (15) (the dependence on $\tau$ is immaterial for present purposes), i.e.,

$$Y'_{T,t} = Y_t + b\,[\bar{C}_T - \bar{C}_t]. \tag{A5}$$

Substituting for $Y_t$ using Equation (A4) gives

$$Y'_{T,t} = a + b\,[\bar{C}_t + \epsilon_{c,t}] + \varepsilon_{Y,t} + b\,[\bar{C}_T - \bar{C}_t], \tag{A6}$$

which simplifies to

$$Y'_{T,t} = a + b\,[\bar{C}_T + \epsilon_{c,t}] + \varepsilon_{Y,t}. \tag{A7}$$

However, from Equation (A1), $\bar{C}_T + \epsilon_{c,t} = C_T$, so Equation (A7) further simplifies to:

$$Y'_{T,t} = a + b\,C_T + \varepsilon_{Y,t}, \tag{A8}$$

which shows that $Y'_{T,t}$ is simply the peak-flow process at year $T$, i.e., $Y_T$ (compare Equation (A3)). Therefore, for a linear peak-flow versus climate relationship with slope coefficient $b$, with climate given by a time-varying mean $\bar{C}_t$ plus a noise (Equation (A1)), shifting by $b\,[\bar{C}_T - \bar{C}_t]$ adjusts the peaks to the selected year $T$, as is assumed in Section 4.2.1.
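The algebra of this appendix can be checked numerically with a minimal sketch. The code below is not the authors' implementation (their scripts are in the data release cited in the Data Availability Statement); it assumes a linearly increasing climate mean, Gaussian noise terms, and hypothetical values for the intercept, slope, trend, and record length, all chosen only for illustration. It simulates Equations (A1) and (A3), applies the shift of Equation (A5), and compares the result with the right-hand side of Equation (A7).

```python
import random


def simulate_and_adjust(n_years=60, a=1.0, b=0.8, trend=0.02, seed=1):
    """Simulate Equations (A1) and (A3), then apply the shift of Equation (A5).

    All parameter values are hypothetical and chosen only for illustration.
    Returns the raw peaks, the adjusted peaks, and the target of Equation (A7).
    """
    rng = random.Random(seed)
    T = n_years - 1  # adjust every year to the final year of the record

    # Assumed linear time-varying climate mean and Gaussian noise terms.
    c_bar = [trend * t for t in range(n_years)]
    eps_c = [rng.gauss(0.0, 0.3) for _ in range(n_years)]
    eps_y = [rng.gauss(0.0, 0.2) for _ in range(n_years)]

    c = [c_bar[t] + eps_c[t] for t in range(n_years)]        # Equation (A1)
    y = [a + b * c[t] + eps_y[t] for t in range(n_years)]    # Equation (A3)
    y_adj = [y[t] + b * (c_bar[T] - c_bar[t])                # Equation (A5)
             for t in range(n_years)]
    y_target = [a + b * (c_bar[T] + eps_c[t]) + eps_y[t]     # Equation (A7)
                for t in range(n_years)]
    return y, y_adj, y_target


def ols_slope(values):
    """Least-squares slope of a series against its index 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - x_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - x_mean) ** 2 for t in range(n))
    return num / den


if __name__ == "__main__":
    y, y_adj, y_target = simulate_and_adjust()
    max_diff = max(abs(p - q) for p, q in zip(y_adj, y_target))
    print("max |adjusted - (A7)|:", max_diff)
    print("slope before adjustment:", ols_slope(y))
    print("slope after adjustment: ", ols_slope(y_adj))
```

Because the assumed climate mean is linear in time, the shift removes exactly `b * trend` from the least-squares slope of the peaks, so the adjusted series retains only the sampling variability of the two noise terms.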

References

1. Hodgkins, G.A.; Dudley, R.W.; Archfield, S.A.; Renard, B. Effects of climate, regulation, and urbanization on historical flood trends in the United States. J. Hydrol. 2019, 573, 697–709.
2. Koutsoyiannis, D.; Montanari, A. Negligent killing of scientific concepts: The stationarity case. Hydrolog. Sci. J. 2015, 60, 1174–1183.
3. Milly, P.C.D.; Betancourt, J.; Falkenmark, M.; Hirsch, R.M.; Kundzewicz, Z.W.; Lettenmaier, D.P.; Stouffer, R.J. Stationarity is dead: Whither water management? Science 2008, 319, 573–574.
4. Milly, P.C.D.; Betancourt, J.; Falkenmark, M.; Hirsch, R.M.; Kundzewicz, Z.W.; Lettenmaier, D.P.; Stouffer, R.J.; Dettinger, M.D.; Krysanova, V. On critiques of “Stationarity is dead: Whither water management?”. Water Resour. Res. 2015, 51, 7785–7789.
5. Ryberg, K.R. (Ed.) Attribution of Monotonic Trends and Change Points in Peak Streamflow Across the Conterminous United States Using a Multiple Working Hypotheses Framework, 1941–2015 and 1966–2015; Professional Paper 1869; U.S. Geological Survey: Reston, VA, USA, 2022.
6. Salas, J.D.; Obeysekera, J.; Vogel, R.M. Techniques for assessing water infrastructure for nonstationary extreme events: A review. Hydrolog. Sci. J. 2018, 63, 325–352.
7. Salas, J.D.; Obeysekera, J. Revisiting the concepts of return period and risk for nonstationary hydrologic extreme events. J. Hydrol. Eng. 2014, 19, 554–568.
8. Serago, J.M.; Vogel, R.M. Parsimonious nonstationary flood frequency analysis. Adv. Water Resour. 2018, 112, 1–16.
9. Serinaldi, F.; Kilsby, C.G. Stationarity is undead—Uncertainty dominates the distribution of extremes. Adv. Water Resour. 2015, 77, 17–36.
10. Villarini, G.; Serinaldi, F.; Smith, J.A.; Krajewski, W.F. On the stationarity of annual flood peaks in the continental United States during the 20th century. Water Resour. Res. 2009, 45, W08417.
11. Villarini, G.; Taylor, S.; Wobus, C.; Vogel, R.; Hecht, J.; White, K.; Baker, B.; Gilroy, K.; Olsen, J.R.; Raff, D. Floods and Nonstationarity: A Review; Civil Works Technical Series 2018–01; U.S. Army Corps of Engineers: Washington, DC, USA, 2018; Available online: https://usace.contentdm.oclc.org/digital/collection/p266001coll1/id/6036/ (accessed on 21 March 2019).
12. Vogel, R.M.; Yaindl, C.; Walter, M. Nonstationarity—Flood magnification and recurrence reduction factors in the United States. J. Am. Water Resour. Assoc. 2011, 47, 464–474.
13. Archfield, S.A.; Hirsch, R.M.; Viglione, A.; Blöschl, G. Fragmented patterns of flood change across the United States. Geophys. Res. Lett. 2016, 43, 10232–10239.
14. Berghuijs, W.R.; Aalbers, E.E.; Larsen, J.R.; Trancoso, R.; Woods, R.A. Recent changes in extreme floods across multiple continents. Environ. Res. Lett. 2017, 12, 114035.
15. Hirsch, R.M.; Ryberg, K.R. Has the magnitude of floods across the USA changed with global CO₂ levels? Hydrolog. Sci. J. 2012, 57, 1–9.
16. Hodgkins, G.A.; Whitfield, P.H.; Burn, D.H.; Hannaford, J.; Renard, B.; Stahl, K.; Fleig, A.K.; Madsen, H.; Mediero, L.; Korhonen, J.; et al. Climate-driven variability in the occurrence of major floods across North America and Europe. J. Hydrol. 2017, 552, 704–717.
17. Sharma, A.; Wasko, C.; Lettenmaier, D.P. If precipitation extremes are increasing, why aren’t floods? Water Resour. Res. 2018, 54, 8545–8551.
18. Koutsoyiannis, D. Hurst-Kolmogorov dynamics and uncertainty. J. Am. Water Resour. Assoc. 2011, 47, 481–495.
19. Lins, H.F.; Cohn, T.A. Stationarity—Wanted dead or alive? J. Am. Water Resour. Assoc. 2011, 47, 475–480.
20. Payton, E.A.; Pinson, A.O.; Asefa, T.; Condon, L.E.; Dupigny-Giroux, L.-A.L.; Harding, B.L.; Kiang, J.; Lee, D.H.; McAfee, S.A.; Pflug, J.M.; et al. Ch. 4. Water. In Fifth National Climate Assessment; Crimmins, A.R., Avery, C.W., Easterling, D.R., Kunkel, K.E., Stewart, B.C., Maycock, T.K., Eds.; U.S. Global Change Research Program: Washington, DC, USA, 2023.
21. Blum, A.G.; Ferraro, P.J.; Archfield, S.A.; Ryberg, K.R. Causal effect of impervious cover on annual flood magnitude for the United States. Geophys. Res. Lett. 2020, 47, e2019GL086480.
22. Espey, W.H., Jr.; Winslow, D.E. Urban flood frequency characteristics. J. Hydr. Div.-ASCE 1974, 100, 279–293.
23. Hollis, G.E. The effect of urbanization on floods of different recurrence interval. Water Resour. Res. 1975, 11, 431–435.
24. Konrad, C.P. Effects of Urban Development on Floods; Fact Sheet 076–03; U.S. Geological Survey: Reston, VA, USA, 2003. Available online: https://pubs.usgs.gov/fs/fs07603/ (accessed on 7 May 2009).
25. Over, T.M.; Saito, R.J.; Soong, D.T. Adjusting Annual Maximum Peak Discharges at Selected Stations in Northeastern Illinois for Changes in Land-Use Conditions; Scientific Investigations Report 2016–5049; U.S. Geological Survey: Reston, VA, USA, 2016.
26. Over, T.M.; Saito, R.J.; Veilleux, A.G.; O’Shea, P.S.; Sharpe, J.B.; Soong, D.T.; Ishii, A.L. Estimation of Peak Discharge Quantiles for Selected Annual Exceedance Probabilities in Northeastern Illinois; Scientific Investigations Report 2016–5050; U.S. Geological Survey: Reston, VA, USA, 2021.
27. Over, T.; Marti, M.; Ortiz, J.; Podzorski, H. The joint effect of changes in urbanization and climate on trends in floods: A comparison of panel and single-station quantile regression approaches. J. Hydrol. 2025, 648, 132281.
28. Sauer, V.B.; Thomas, W.O., Jr.; Stricker, V.A.; Wilson, K.V. Flood Characteristics of Urban Watersheds in the United States; Water-Supply Paper 2207; U.S. Geological Survey: Reston, VA, USA, 1983.
29. Yang, W.; Yang, H.; Yang, D.; Hou, A. Causal effects of dams and land cover changes on flood changes in mainland China. Hydrol. Earth Syst. Sci. 2021, 25, 2705–2720.
30. Wasko, C.; Westra, S.; Nathan, R.; Orr, H.G.; Villarini, G.; Villalobos Herrera, R.; Fowler, H.J. Incorporating climate change in flood estimation guidance. Philos. Trans. R. Soc. A 2021, 379, 20190548.
31. England, J.F., Jr.; Cohn, T.A.; Faber, B.A.; Stedinger, J.R.; Thomas, W.O., Jr.; Veilleux, A.G.; Kiang, J.E.; Mason, R.R., Jr. Guidelines for Determining Flood Flow Frequency—Bulletin 17C (Ver. 1.1, May 2019); Techniques and Methods 4-B5; U.S. Geological Survey: Reston, VA, USA, 2019.
32. François, B.; Schlef, K.E.; Wi, S.; Brown, C.M. Design considerations for riverine floods in a changing climate—A review. J. Hydrol. 2019, 574, 557–573.
33. Kilgore, R.; Thomas, W.O., Jr.; Douglass, S.; Webb, B.; Hayhoe, K.; Stoner, A.; Jacobs, J.M.; Thompson, D.; Hermann, G.; Douglas, E.; et al. Applying Climate Change Information to Hydrologic and Coastal Design of Transportation Infrastructure [Design Practices Guide]; National Cooperative Highway Research Program, Transportation Research Board: Washington, DC, USA, 2019; p. 145. Available online: https://apps.trb.org/cmsfeed/trbnetprojectdisplay.asp?projectid=4046 (accessed on 28 November 2023).
34. Kilgore, R.; Thomas, W.O., Jr.; Douglass, S.; Webb, B.; Hayhoe, K.; Stoner, A.; Jacobs, J.; Thompson, D.B.; Hermann, G.R.; Douglas, E.; et al. Applying Climate Change Information to Hydrologic and Coastal Design of Transportation Infrastructure [Final Report]; National Cooperative Highway Research Program, Transportation Research Board: Washington, DC, USA, 2019; p. 201. Available online: https://apps.trb.org/cmsfeed/trbnetprojectdisplay.asp?projectid=4046 (accessed on 28 November 2023).
35. Madsen, H.; Lawrence, D.; Lang, M.; Martinkova, M.; Kjeldsen, T.R. Review of trend analysis and climate change projections of extreme precipitation and floods in Europe. J. Hydrol. 2014, 519 Pt D, 3634–3650.
36. Schlef, K.E.; François, B.; Brown, C. Comparing flood projection approaches across hydro-climatologically diverse United States river basins. Water Resour. Res. 2021, 57, e2019WR025861.
37. Slater, L.J.; Anderson, B.; Buechel, M.; Dadson, S.; Han, S.; Harrigan, S.; Kelder, T.; Kowal, K.; Lees, T.; Matthews, T.; et al. Nonstationary weather and water extremes: A review of methods for their detection, attribution, and management. Hydrol. Earth Syst. Sci. 2021, 25, 3897–3935.
38. Schlef, K.E.; François, B.; Robertson, A.W.; Brown, C. A general methodology for climate-informed approaches to long-term flood projection—Illustrated with the Ohio River Basin. Water Resour. Res. 2018, 54, 9321–9341.
39. Glas, R.; Hecht, J.; Simonson, A.; Gazoorian, C.; Schubert, C. Adjusting design floods for urbanization across groundwater-dominated watersheds of Long Island, NY. J. Hydrol. 2023, 618, 129194.
40. Hecht, J.S.; Barth, N.A.; Ryberg, K.R.; Gregory, A.E. Simulation experiments comparing nonstationary design-flood adjustments based on observed annual peak flows in the conterminous United States. J. Hydrol. X 2022, 17, 100115.
41. Hecht, J.S.; Vogel, R.M. Updating urban design floods for changes in central tendency and variability using regression. Adv. Water Resour. 2020, 136, 103484.
42. Luke, A.; Vrugt, J.A.; AghaKouchak, A.; Matthew, R.; Sanders, B.F. Predicting nonstationary flood frequencies: Evidence supports an updated stationarity thesis in the United States. Water Resour. Res. 2017, 53, 5469–5494.
43. Benson, M. Factors Influencing the Occurrence of Floods in a Humid Region of Diverse Terrain; Water Supply Paper 1580-B; U.S. Geological Survey: Washington, DC, USA, 1963.
44. Dalrymple, T. Flood-Frequency Analyses, Manual of Hydrology: Part 3; Water Supply Paper 1543-A; U.S. Geological Survey: Washington, DC, USA, 1960.
45. Stedinger, J.R.; Vogel, R.M.; Foufoula-Georgiou, E. Frequency analysis of extreme events. In Handbook of Hydrology; Maidment, D.R., Ed.; McGraw-Hill: New York, NY, USA, 1993; pp. 18.1–18.66. ISBN 0070397325.
46. Eng, K.; Chen, Y.-Y.; Kiang, J.E. User’s Guide to the Weighted-Multiple-Linear Regression Program (WREG Version 1.0); Techniques and Methods 4-A8; U.S. Geological Survey: Reston, VA, USA, 2009.
47. Farmer, W.H.; Kiang, J.E.; Feaster, T.D.; Eng, K. Regionalization of Surface-Water Statistics Using Multiple Linear Regression (Ver. 1.1, February 2021); Techniques and Methods, 4-A12; U.S. Geological Survey: Reston, VA, USA, 2021.
48. Singh, R.; Wagener, T.; van Werkhoven, K.; Mann, M.E.; Crane, R. A trading-space-for-time approach to probabilistic continuous streamflow predictions in a changing climate—Accounting for changing watershed behavior. Hydrol. Earth Syst. Sci. 2011, 15, 3591–3603.
49. Singh, R.; van Werkhoven, K.; Wagener, T. Hydrological impacts of climate change in gauged and ungauged watersheds of the Olifants basin: A trading-space-for-time approach. Hydrolog. Sci. J. 2014, 59, 29–55.
50. Berghuijs, W.R.; Woods, R.A. Correspondence: Space-time asymmetry undermines water yield assessment. Nat. Commun. 2016, 7, 11603.
51. Perdigão, R.A.P.; Blöschl, G. Spatiotemporal flood sensitivity to annual precipitation: Evidence for landscape-climate coevolution. Water Resour. Res. 2014, 50, 5492–5509.
52. Wooldridge, J.M. Introductory Econometrics: A Modern Approach, 5th ed.; South-Western, Cengage Learning: Mason, OH, USA, 2013; p. 881. ISBN 1111531048.
53. Anderson, B.J.; Slater, L.J.; Dadson, S.J.; Blum, A.G.; Prosdocimi, I. Statistical attribution of the influence of urban and tree cover change on streamflow: A comparison of large sample statistical approaches. Water Resour. Res. 2022, 58, e2021WR030742.
54. Bassiouni, M.; Vogel, R.M.; Archfield, S.A. Panel regressions to estimate low-flow response to rainfall variability in ungaged basins. Water Resour. Res. 2016, 52, 9470–9494.
55. Ferreira, S.; Ghimire, R. Forest cover, socioeconomics, and reported flood frequency in developing countries. Water Resour. Res. 2012, 48, 2011WR011701.
56. Steinschneider, S.; Yang, Y.-C.E.; Brown, C. Panel regression techniques for identifying impacts of anthropogenic landscape change on hydrologic response. Water Resour. Res. 2013, 49, 7874–7886.
57. Greene, W.H. Econometric Analysis, 3rd ed.; Prentice-Hall: Upper Saddle River, NJ, USA, 1997; p. 1075. ISBN 0023466022.
58. Koenker, R.; Bassett, G. Regression quantiles. Econometrica 1978, 46, 33–50.
59. Frumento, P.; Bottai, M. Parametric modeling of quantile regression coefficient functions. Biometrics 2016, 72, 74–84.
60. Ouali, D.; Chebana, F.; Ouarda, T.B.M.J. Quantile regression in regional frequency analysis: A better exploitation of the available information. J. Hydrometeorol. 2016, 17, 1869–1883.
61. Sankarasubramanian, A.; Lall, U. Flood quantiles in a changing climate: Seasonal forecasts and causal relation. Water Resour. Res. 2003, 39, 1134.
62. Konrad, C.; Restivo, D. Assessment and significance of the frequency domain for trends in annual peak streamflow. J. Flood Risk Manag. 2021, 14, e12761.
63. Nasri, B.; Bouezmarni, T.; St-Hilaire, A.; Ouarda, T.B.M.J. Non-stationary hydrologic frequency analysis using B-spline quantile regression. J. Hydrol. 2017, 554, 532–544.
64. Qu, C.; Li, J.; Yan, L.; Yan, P.; Cheng, F.; Lu, D. Non-stationary flood frequency analysis using cubic b-spline-based GAMLSS model. Water 2020, 12, 1867.
65. Ryberg, K.R. (Ed.) Peak Streamflow Trends and Their Relation to Changes in Climate in Illinois, Iowa, Michigan, Minnesota, Missouri, Montana, North Dakota, South Dakota, and Wisconsin; Scientific Investigations Report 2023–5064; U.S. Geological Survey: Reston, VA, USA, 2024.
66. Villarini, G.; Smith, J.A.; Baeck, M.L.; Vitolo, R.; Stephenson, D.B.; Krajewski, W.F. On the frequency of heavy rainfall for the Midwest of the United States. J. Hydrol. 2011, 400, 103–120.
67. Villarini, G.; Slater, L.J. Examination of changes in annual maximum gauge height in the continental United States using quantile regression. J. Hydrol. Eng. 2018, 23, 06017010.
68. Awasthi, C.; Archfield, S.A.; Reich, B.J.; Sankarasubramanian, A. Beyond simple trend tests: Detecting significant changes in design-flood quantiles. Geophys. Res. Lett. 2023, 50, e2023GL103438.
69. Koenker, R. Quantile regression for longitudinal data. J. Multivar. Anal. 2004, 91, 74–89.
70. Galvao, A.F.; Kato, K. Quantile regression methods for longitudinal data. In Handbook of Quantile Regression; Koenker, R., Chernozhukov, V., He, X., Peng, L., Eds.; CRC Press, Taylor and Francis Group: Boca Raton, FL, USA, 2017; pp. 363–380. ISBN 978-1498725286.
71. Canay, I.A. A simple approach to quantile regression for panel data. Econom. J. 2011, 14, 368–386.
72. Machado, J.A.F.; Santos Silva, J.M.C. Quantiles via moments. J. Econom. 2019, 213, 145–173.
73. R Core Team. R: A Language and Environment for Statistical Computing (Version 4.3.1) [Computer Software]; R Foundation for Statistical Computing: Vienna, Austria, 2023; Available online: https://www.R-project.org (accessed on 28 November 2023).
74. Koenker, R. quantreg: Quantile Regression [R]. 2023. Available online: https://CRAN.R-project.org/package=quantreg (accessed on 28 November 2023).
75. Dhaene, G.; Jochmans, K. Split-panel jackknife estimation of fixed-effect models. Rev. Econ. Stud. 2015, 82, 991–1030.
76. Croissant, Y.; Millo, G. Panel data econometrics in R: The plm package. J. Stat. Softw. 2008, 27, 1–43.
77. Millo, G. Robust standard error estimators for panel models: A unifying approach. J. Stat. Softw. 2017, 82, 1–27.
78. Arellano, M. Computing robust standard errors for within-groups estimators. Oxf. Bull. Econ. Stat. 1987, 49, 431–434.
79. Marti, M.K.; Podzorski, H.L.; Over, T.M. Data for Regional Analysis of the Dependence of Peak-Flow Quantiles on Climate with Application to Adjustment to Climate Trends [Dataset]; U.S. Geological Survey: Reston, VA, USA, 2025.
80. Helsel, D.R.; Hirsch, R.M.; Ryberg, K.R.; Archfield, S.A.; Gilroy, E.J. Statistical Methods in Water Resources; Techniques and Methods 4-A3; U.S. Geological Survey: Reston, VA, USA, 2020.
81. He, X. Quantile curves without crossing. Am. Stat. 1997, 51, 186–192.
82. Zhao, Q. Restricted regression quantiles. J. Multivar. Anal. 2000, 72, 78–99.
83. Mann, H.B. Nonparametric tests against trend. Econometrica 1945, 13, 245–259.
84. Dakota Water Science Center. Flood-Frequency Analysis in the Midwest: Addressing Potential Nonstationary Annual Peak-Flow Records. U.S. Geological Survey. Available online: https://www.usgs.gov/centers/dakota-water/science/flood-frequency-analysis-midwest-addressing-potential-nonstationary (accessed on 21 June 2024).
85. Ryberg, K.R.; Over, T.M.; Levin, S.B.; Heimann, D.C.; Barth, N.A.; Marti, M.K.; O’Shea, P.S.; Sanocki, C.A.; Williams-Sether, T.J.; Wavra, H.N.; et al. Introduction and methods of analysis for peak streamflow trends and their relation to changes in climate in Illinois, Iowa, Michigan, Minnesota, Missouri, Montana, North Dakota, South Dakota, and Wisconsin. In Ch. A of Peak Streamflow Trends and Their Relation to Changes in Climate in Illinois, Iowa, Michigan, Minnesota, Missouri, Montana, North Dakota, South Dakota, and Wisconsin; Scientific Investigations Report 2023–5064; Ryberg, K.R., Ed.; U.S. Geological Survey: Reston, VA, USA, 2024.
86. Ryberg, K.R.; Williams-Sether, T. Peak streamflow trends in North Dakota and their relation to changes in climate, water years 1921–2020. In Ch. H of Peak Streamflow Trends and Their Relation to Changes in Climate in Illinois, Iowa, Michigan, Minnesota, Missouri, Montana, North Dakota, South Dakota, and Wisconsin; Scientific Investigations Report 2023–5064; Ryberg, K.R., Ed.; U.S. Geological Survey: Reston, VA, USA, 2025.
87. Marti, M.K.; Ryberg, K.R. Method for Identification of Reservoir Regulation Within U.S. Geological Survey Streamgage Basins in the Central United States Using a Decadal Dam Impact Metric; Open-File Report 2023–1034; U.S. Geological Survey: Reston, VA, USA, 2023.
88. Dewitz, J. National Land Cover Database (NLCD) 2019 Products (Ver. 2.0, June 2021) [Dataset]; U.S. Geological Survey data release; U.S. Geological Survey: Reston, VA, USA, 2021.
89. Wickham, J.; Stehman, S.V.; Sorenson, D.G.; Gass, L.; Dewitz, J.A. Thematic accuracy assessment of the NLCD 2019 land cover for the conterminous United States. GISci. Remote Sens. 2023, 60, 2181143.
90. Falcone, J. GAGES-II: Geospatial Attributes of Gages for Evaluating Streamflow [Dataset]; U.S. Geological Survey: Reston, VA, USA, 2011.
91. Veilleux, A.G. Bayesian GLS Regression for Regionalization of Hydrologic Statistics, Floods, and Bulletin 17 Skew. Master’s Thesis, Cornell University, Ithaca, NY, USA, 2009. Available online: https://ecommons.cornell.edu/bitstream/handle/1813/13819/Veilleux,%20Andrea.pdf?sequence=1 (accessed on 19 February 2015).
92. Marti, M.K.; Wavra, H.N.; Over, T.M.; Ryberg, K.R.; Podzorski, H.L.; Chen, Y.R. Peak Streamflow Data, Climate Data, and Results from Investigating Hydroclimatic Trends and Climate Change Effects on Peak Streamflow in the Central United States, 1920-2020 [Dataset]; U.S. Geological Survey: Reston, VA, USA, 2024.
93. Dahl, T.E. Status and Trends of Prairie Wetlands in the United States 1997 to 2009; U.S. Department of the Interior, Fish and Wildlife Service, Ecological Services: Washington, DC, USA, 2014; p. 67. Available online: https://www.fws.gov/sites/default/files/documents/Status-and-Trends-of-Prairie-Wetlands-in-the-United-States-1997-to-2009.pdf (accessed on 28 August 2024).
94. Pierce, D.W.; Su, L.; Cayan, D.R.; Risser, M.D.; Livneh, B.; Lettenmaier, D.P. An extreme-preserving long-term gridded daily precipitation dataset for the conterminous United States. J. Hydrometeorol. 2021, 22, 1883–1895.
95. Livneh, B.; Rosenberg, E.A.; Lin, C.; Nijssen, B.; Mishra, V.; Andreadis, K.M.; Maurer, E.P.; Lettenmaier, D.P. A long-term hydrologically based dataset of land surface fluxes and states for the conterminous United States: Update and extensions. J. Clim. 2013, 26, 9384–9392.
96. McCabe, G.J.; Markstrom, S.L. A Monthly Water-Balance Model Driven by a Graphical User Interface; Open-File Report 2007–1088; U.S. Geological Survey: Reston, VA, USA, 2007. Available online: https://pubs.usgs.gov/of/2007/1088/pdf/of07-1088_508.pdf (accessed on 14 December 2007).
97. McCabe, G.J.; Wolock, D.M. Independent effects of temperature and precipitation on modeled runoff in the conterminous United States. Water Resour. Res. 2011, 47, W11522.
98. U.S. Geological Survey. USGS Water Data for the Nation—U.S. Geological Survey National Water Information System Database [National Water Information System—Web Interface]; U.S. Geological Survey: Reston, VA, USA, 2024.
99. Zambrano-Bigiarini, M.; Rojas, R. hydroPSO: Particle Swarm Optimisation, with Focus on Environmental Models (Version R Package Version 0.5-1) [Computer Software]. 2020. Available online: https://github.com/hzambran/hydroPSO (accessed on 23 January 2025).
100. Gupta, H.V.; Kling, H.; Yilmaz, K.K.; Martinez, G.F. Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. J. Hydrol. 2009, 377, 80–91.
101. Kling, H.; Fuchs, M.; Paulin, M. Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios. J. Hydrol. 2012, 424–425, 264–277.
102. Seager, R.; Lis, N.; Feldman, J.; Ting, M.; Williams, A.P.; Nakamura, J.; Liu, H.; Henderson, N. Whither the 100th Meridian? The once and future physical and human geography of America’s arid–humid divide. Part I: The story so far. Earth Interact. 2018, 22, 1–22.
103. Seager, R.; Feldman, J.; Lis, N.; Ting, M.; Williams, A.P.; Nakamura, J.; Liu, H.; Henderson, N. Whither the 100th Meridian? The once and future physical and human geography of America’s arid–humid divide. Part II: The meridian moves east. Earth Interact. 2018, 22, 1–24.
104. Cleveland, W.S.; Grosse, E.; Shyu, W.M. Local regression models. In Statistical Models in S; Chambers, J.M., Hastie, T.J., Eds.; Chapman and Hall/CRC Press: New York, NY, USA, 1992; pp. 309–376. ISBN 0534167640.
105. Gebert, W.A.; Garn, H.S.; Rose, W.J. Changes in Streamflow Characteristics in Wisconsin as Related to Precipitation and Land Use; Scientific Investigations Report 2015–5140; U.S. Geological Survey: Reston, VA, USA, 2016.
106. Gebert, W.A.; Krug, W.R. Streamflow trends in Wisconsin’s Driftless Area. J. Am. Water Resour. Assoc. 1996, 32, 733–744.
107. Juckem, P.F.; Hunt, R.J.; Anderson, M.P.; Robertson, D.M. Effects of climate and land management change on streamflow in the driftless area of Wisconsin. J. Hydrol. 2008, 355, 123–130.
108. Park, D.; Markus, M. Analysis of a changing hydrologic flood regime using the Variable Infiltration Capacity model. J. Hydrol. 2014, 515, 267–280.
109. Potter, K.W. Hydrological impacts of changing land management practices in a moderate-sized agricultural catchment. Water Resour. Res. 1991, 27, 845–855.
110. Levin, S.B. Peak streamflow trends in Wisconsin and their relation to changes in climate, water years 1921–2020, U.S. Geological Survey. In Ch. J of Peak Streamflow Trends and Their Relation to Changes in Climate in Illinois, Iowa, Michigan, Minnesota, Missouri, Montana, North Dakota, South Dakota, and Wisconsin; Scientific Investigations Report 2023–5064; Ryberg, K.R., Ed.; U.S. Geological Survey: Reston, VA, USA, 2024.

Figure 1. Flowchart of analysis methods used in this paper.

Figure 2. Map of study region showing streamgages used in this analysis and trends, as measured by Kendall’s tau correlation, in (A) peak-flow and (B) annual maximum simulated daily discharge (ann_max_Q). The locations of two U.S. Geological Survey streamgages used in example analyses, 05436500 (Sugar River Near Brodhead, Wisconsin) and 03339000 (Vermilion River Near Danville, Illinois), are enclosed in square boxes and labeled with their station numbers.

Figure 3. Mann–Kendall time trend (A) tau and (B) p-values for peak flows and selected climate variables at study basins. In the boxplots, the bold mid-line of the boxes is the median; the lower and upper limits of the boxes (the “hinges”) are the first and third quartiles, respectively; and the upper (lower) whiskers extend to the largest (smallest) point no farther than 1.5*IQR beyond the upper (lower) hinge, where IQR is the inter-quartile range [ann_max_Q, annual maximum simulated daily discharge; ann_max_prcp, annual maximum daily precipitation; ann_tot_Q, annual total discharge; ann_tot_prcp, annual total precipitation].
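
The boxplot convention described in this caption (and repeated in later captions) can be illustrated with a minimal sketch. The function name `boxplot_stats` and the sample values are hypothetical, and the quartile rule (linear interpolation via `statistics.quantiles` with `method="inclusive"`) is an assumption, because the caption does not state which quantile definition was used.

```python
from statistics import quantiles


def boxplot_stats(values):
    """Return the hinges, median, whisker ends, and outliers of a boxplot."""
    data = sorted(values)
    # Hinges and median; the interpolation rule is an assumption (see lead-in).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers end at the most extreme data points lying within 1.5*IQR
    # of the hinges, not at the fences themselves.
    lower_whisker = min(v for v in data if v >= q1 - 1.5 * iqr)
    upper_whisker = max(v for v in data if v <= q3 + 1.5 * iqr)
    outliers = [v for v in data if v < lower_whisker or v > upper_whisker]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outliers": outliers,
    }


# Hypothetical sample, not data from this study.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

In this hypothetical sample the value 100 lies more than 1.5*IQR above the upper hinge, so the upper whisker stops at 9 and 100 is reported as an outlier.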

Figure 4. Adjusted R² values of the least-squares single-station and location-model panel regressions with selected climate variables, peak-flow, and climate variable transformations, and approaches to water-balance model (MWBM) calibration. Boxplots portray the single-station results for each run and are ordered by the median of the single-station adjusted coefficient of determination (R²) values, with the largest median to the left. In the boxplots, the bold mid-line of the boxes is the median; the lower and upper limits of the boxes (the “hinges”) are the first and third quartiles, respectively; and the upper (lower) whiskers extend to the largest (smallest) point no farther than 1.5*IQR beyond the upper (lower) hinge, where IQR is the inter-quartile range. In the x-axis labels, the first segment of the string gives the peak-flow transformation, the second segment gives the climate variable and its transformation, and the third and final segment gives the approach to MWBM calibration [MMQR, method-of-moments quantile regression; ann_max_Q, annual maximum simulated daily discharge; ann_max_prcp, annual maximum daily precipitation; ann_tot_Q, annual total discharge; ann_tot_prcp, annual total precipitation].

Figure 5. Regression coefficients (A) and coefficient standard errors (B) of the log(peak) ~ sqrt(ann_max_Q) [(annual maximum of simulated daily discharge)^1/2] model. Notes: (1) Boxplot outliers (points beyond the limits of the whiskers) are not plotted to allow for better visualization of the remaining data. (2) In the boxplots, the bold mid-line of the boxes is the median; the lower and upper limits of the boxes (the “hinges”) are the first and third quartiles, respectively; and the upper (lower) whiskers extend to the largest (smallest) point no farther than 1.5*IQR beyond the upper (lower) hinge, where IQR is the inter-quartile range [MMQR, method-of-moments quantile regression].

Figure 6. Fixed effects from the log(peak) ~ sqrt(ann_max_Q) [(annual maximum of simulated daily discharge)^1/2] method-of-moments quantile regression (MMQR) model versus their robust standard errors: (A) the location effect $a_{i}^{P}$ and (B) the scale effect $\delta_{i}$.

Figure 7. Fixed effects from the log(peak) ~ sqrt(ann_max_Q) [(annual maximum of modeled daily discharge)^1/2] method-of-moments quantile regression (MMQR) model versus selected static basin characteristics: (A) the location effect $a_{i}^{P}$ east of the −98th meridian; (B) the location effect $a_{i}^{P}$ west of the −98th meridian; (C) the scale effect $\delta_{i}$ east of the −98th meridian; and (D) the scale effect $\delta_{i}$ west of the −98th meridian [lm, linear model (least-squares regression fit); Ktau, Kendall’s tau; Ktau_p, p-value of Kendall’s tau].

Figure 8. Comparisons of adjusted and observed moments of log-transformed peak flows (in cubic feet per second [ft³/s]) based on the log(peak) ~ sqrt(ann_max_Q) [(annual maximum of simulated daily discharge)^1/2] method-of-moments quantile regression (MMQR) model: (A) mean; (B) standard deviation; (C) skewness. Note: “median change” indicates median of the adjusted minus observed values for each basin, and “fraction increasing” indicates the fraction of moment values that increase due to adjustment, i.e., the fraction of adjusted minus observed values that are positive.

Figure 9. Comparisons of adjusted and observed quantiles of log-transformed peak flows based on the log(peak) ~ sqrt(ann_max_Q) method-of-moments quantile regression (MMQR) model: (A) quantiles with annual exceedance probability (AEP) of 0.5; (B) quantiles with AEP of 0.1; (C) quantiles with AEP of 0.04; (D) quantiles with AEP of 0.01. Notes: (1) The quantiles were estimated by linear interpolation between the plotting positions of the log-transformed peak-flow values, where the plotting positions were computed as (k − 1/3)/(n + 1/3), where k is the rank and n is the number of values; (2) “median adj/obs ratio” indicates the median of the ratios of the adjusted to observed quantile for each basin, and “fraction increasing” indicates the fraction of quantiles that increase after adjustment [ft³/s, cubic feet per second].
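
The quantile estimate described in note (1) can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, ranks are assumed to run from the smallest (k = 1) to the largest (k = n) value so that an AEP corresponds to a non-exceedance probability of 1 − AEP, and targets outside the range of the plotting positions are clamped to the smallest or largest value, a choice the caption does not specify.

```python
def plotting_positions(n):
    """Plotting positions (k - 1/3) / (n + 1/3) for ranks k = 1..n."""
    return [(k - 1 / 3) / (n + 1 / 3) for k in range(1, n + 1)]


def interpolated_quantile(log_peaks, aep):
    """Quantile of log-transformed peaks for a given annual exceedance probability."""
    values = sorted(log_peaks)          # ascending, so rank 1 is the smallest peak
    probs = plotting_positions(len(values))
    target = 1.0 - aep                  # assumed conversion to non-exceedance probability
    # Assumption: clamp instead of extrapolating beyond the plotting positions.
    if target <= probs[0]:
        return values[0]
    if target >= probs[-1]:
        return values[-1]
    # Linear interpolation between the two bracketing plotting positions.
    for i in range(1, len(values)):
        if target <= probs[i]:
            weight = (target - probs[i - 1]) / (probs[i] - probs[i - 1])
            return values[i - 1] + weight * (values[i] - values[i - 1])
    return values[-1]


# Hypothetical log-transformed peaks; plotting positions are 0.2, 0.5, 0.8.
example = [1.0, 2.0, 3.0]
q_aep_050 = interpolated_quantile(example, 0.5)
q_aep_035 = interpolated_quantile(example, 0.35)
q_aep_001 = interpolated_quantile(example, 0.01)
```

With a short record, the largest plotting position can fall below 1 − AEP for small AEP values (here 0.8 versus 0.99), which is where the clamping assumption takes effect.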

Figure 10. Trends characterized by Kendall’s tau (Ktau) before (observed) and after adjustment: (A) adjusted peak flow versus observed trends; (B) change in peak-flow trend (adjusted-observed) versus observed peak-flow trend; and (C) boxplots of trends [obs peak, observed peak; adj peak, adjusted peaks; ann_max_Q, annual maximum simulated discharge]. In the boxplots, the bold mid-line of the boxes is the median; the lower and upper limits of the boxes (the “hinges”) are the first and third quartiles, respectively; and the upper (lower) whiskers extend to the largest (smallest) point no farther than 1.5*IQR beyond the upper (lower) hinge, where IQR is the inter-quartile range.

Figure 11. Adjustment of peak flows to 2018 climate conditions at U.S. Geological Survey streamgage 03339000, Vermilion River near Danville, Illinois, using method-of-moments quantile regression (MMQR): (A) time series of observed climate variable ann_max_Q^1/2 (annual maximum simulated daily discharge, square root-transformed) and fitted trend line; (B) time series of observed peak flows and predicted quantiles; (C) scatterplot of observed peak flows versus climate with fitted quantile regression lines; (D) time series of observed and adjusted peak flows with fitted trend lines; (E) adjusted versus observed peak flows; and (F) empirical cumulative distribution functions of observed and adjusted peak flows.

Figure 12. Adjustment of peak flows to 2018 climate conditions at U.S. Geological Survey streamgage 05436500, Sugar River near Brodhead, Wisconsin, using method-of-moments quantile regression (MMQR): (A) time series of observed climate variable ann_max_Q^1/2 (annual maximum simulated daily discharge, square root-transformed) and fitted trend line; (B) time series of observed peak flows and predicted quantiles; (C) scatterplot of observed peak flows versus climate with fitted quantile regression lines; (D) time series of observed and adjusted peak flows with fitted trend lines; (E) adjusted versus observed peak flows; and (F) empirical cumulative distribution functions of observed and adjusted peak flows.

Table 1. Quantiles of station record length and of selected properties of basins used in this study [sq. km., square kilometers; PET, potential evapotranspiration; m, meters; min., minimum; max., maximum].

Quantile	Years of Record	Drainage Area (sq. km.)	Mean Annual Precipitation (MAP) (mm)	Aridity = PET/MAP	Mean Elevation (m)
0% (min.)	38	28.0	317	0.249	145
5%	45	107	419	0.559	197
10%	47	163	496	0.657	217
25%	48	485	649	0.727	274
50%	69	1116	824	0.791	349
75%	73	2501	913	0.917	503
90%	93.1	4688	1062	1.281	1661
95%	97	6808	1123	1.497	2182
100% (max.)	98	38,732	1755	1.964	3137

Table 2. Water-balance model MWBM adjustable parameter definitions and ranges of values [degC, degrees Celsius; mm, millimeters].

Parameter Name	Definition	Range of Values Considered in This Study	Minimum Calibrated Value	Median Calibrated Value	Maximum Calibrated Value
Train	Temperature above which all precipitation is rain	0 to 10 degC	0	4.26	10
Tsnow	Temperature below which all precipitation is snow	−10 to 0 degC	−10	−1.60	0
Melt factor	Upper bound of fraction of snow storage that can melt each day	0.01 to 0.2	0.01	0.154	0.2
STC	Soil water storage capacity	20 to 500 mm	20	329	500
Drofrac	Fraction of rain that runs off directly, i.e., during the same day as it occurs	0 to 0.50	0	0.0132	0.282
Rofrac	Fraction of excess soil moisture that runs off each day	0.01 to 0.3	0.01	0.0851	0.3

Table 3. Fractions of significance classes of single-station Mann–Kendall time trend tau values [ann_max_Q, annual maximum daily discharge simulated by MWBM model; ann_max_prcp, basin-average annual maximum daily precipitation; ann_tot_Q, annual total discharge simulated by MWBM model; ann_tot_prcp, basin-average annual total precipitation; |, given]. Note: There are two columns for each set of trends, one for those with trend tau values ≤ 0 and another for those with trend tau values > 0. Therefore, it is the total fraction in each pair of columns that sums to 1.

	Peak Flow vs. Water Year | trend ≤ 0	Peak Flow vs. Water Year | trend > 0	ann_max_Q vs. Water Year | trend ≤ 0	ann_max_Q vs. Water Year | trend > 0	ann_max_prcp vs. Water Year | trend ≤ 0	ann_max_prcp vs. Water Year | trend > 0	ann_tot_Q vs. Water Year | trend ≤ 0	ann_tot_Q vs. Water Year | trend > 0	ann_tot_prcp vs. Water Year | trend ≤ 0	ann_tot_prcp vs. Water Year | trend > 0
p ≤ 0.05	0.079	0.161	0.012	0.348	0.009	0.139	0.012	0.412	0.012	0.418
p > 0.05	0.352	0.409	0.133	0.506	0.267	0.585	0.109	0.467	0.079	0.491
Total	0.430	0.570	0.145	0.855	0.276	0.724	0.121	0.879	0.091	0.909

Table 4. Kendall’s tau correlation among Mann–Kendall time trend tau values for each basin [ann_max_Q, annual maximum daily discharge simulated by MWBM model; ann_max_prcp, basin-average annual maximum daily precipitation; ann_tot_Q, annual total discharge simulated by MWBM model; ann_tot_prcp, basin-average annual total precipitation].

	Peak Flow	ann_max_Q	ann_max_prcp	ann_tot_Q	ann_tot_prcp
peak flow	1	0.284	0.153	0.160	0.148
ann_max_Q	0.284	1	0.440	0.606	0.550
ann_max_prcp	0.153	0.440	1	0.331	0.337
ann_tot_Q	0.160	0.606	0.331	1	0.769
ann_tot_prcp	0.148	0.550	0.337	0.769	1

Table 5. Single-station and panel regression coefficients and related statistics of the log(peak)~sqrt(ann_max_Q) [(annual maximum of simulated daily discharge)^1/2] model [Q1, first quartile; SE, standard error; Q3, third quartile; MMQR, method-of-moments quantile regression; q(tau), quantiles of the MMQR noise variable U; lm/plm, single-station and panel least-squares linear models; NA, not applicable].

tau (1)	Single-Station Mean	Single-Station Q1	Single-Station Median	Single-Station Median SE	Single-Station Median t Statistic	Single-Station Q3	Single-Station Coeff.v0_SE_frac2 (2)	Single-Station Coeff.vMMQR_SE_frac2 (3)	MMQR Coeff	MMQR Coeff Robust SE	MMQR Coeff t Statistic	q(tau)
lm/plm	0.848	0.548	0.751	0.118	6.4	1.017	0.985	0.488	0.668	0.017	38.2	NA
MMQR scale	NA	NA	NA	NA	NA	NA	NA	NA	−0.0782	0.0066	NA	NA
0.002	NA	NA	NA	NA	NA	NA	NA	NA	1.134	0.049	23.0	−5.33
0.01	NA	NA	NA	NA	NA	NA	NA	NA	0.964	0.030	31.8	−3.51
0.04	1.074	0.651	0.911	0.223	4.1	1.311	0.800	0.636	0.863	0.022	39.4	−2.35
0.1	1.000	0.619	0.872	0.215	4.1	1.212	0.867	0.579	0.802	0.019	42.2	−1.63
0.25	0.907	0.559	0.819	0.165	5.0	1.115	0.936	0.555	0.735	0.018	41.8	−0.83
0.5	0.838	0.524	0.769	0.134	5.8	1.015	0.955	0.530	0.665	0.017	38.6	0.04
0.75	0.760	0.469	0.667	0.133	5.0	0.934	0.921	0.464	0.600	0.017	34.5	0.84
0.9	0.686	0.403	0.584	0.184	3.2	0.836	0.773	0.488	0.542	0.019	29.3	1.54
0.96	0.666	0.362	0.564	0.196	2.9	0.852	0.664	0.506	0.494	0.021	23.6	2.13
0.99	NA	NA	NA	NA	NA	NA	NA	NA	0.423	0.027	15.6	2.93
0.998	NA	NA	NA	NA	NA	NA	NA	NA	0.333	0.038	8.8	4.00

Notes: (1) In the column with header “tau,” the numerical values indicate tau values, and the data in those rows are for the quantile indicated by the tau value. The top row contains ordinary least-squares single-station and panel regression location results; the second row gives panel regression scale results. (2) The column headed “Single-station coeff.v0_SE_frac2” contains the fraction of single-station coefficients whose absolute values exceed their standard error by at least a factor of 2. This fraction approximates the fraction of single-station coefficients for which a t-test with the null hypothesis that the coefficients are 0 has p < 0.05. (3) The column headed “Single-station coeff.vMMQR_SE_frac2” contains the fraction of single-station coefficients for which the absolute values of the differences between the single-station and MMQR coefficients exceeds the single-station standard error by at least a factor of 2. This fraction approximates the fraction of single-station coefficients for which a t-test with the null hypothesis that the single-station and MMQR coefficients are the same has p < 0.05.
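
The two fractions defined in notes (2) and (3) can be computed with the same comparison, as in the following minimal sketch. The function name `frac_exceeding_2se` and the coefficient and standard-error values are hypothetical; only the reference value 0.668 (the panel least-squares coefficient in the first row of the table) is taken from the table, and it is used here purely for illustration.

```python
def frac_exceeding_2se(coefs, std_errors, reference=0.0):
    """Fraction of coefficients differing from `reference` by at least 2 standard errors."""
    flags = [
        abs(coef - reference) >= 2.0 * se
        for coef, se in zip(coefs, std_errors)
    ]
    return sum(flags) / len(flags)


# Hypothetical single-station coefficients and their standard errors.
coefs = [0.9, 0.2, 0.7, 1.5, 0.6]
std_errors = [0.2, 0.15, 0.4, 0.3, 0.1]

# Note (2): comparison against zero.
frac_vs_zero = frac_exceeding_2se(coefs, std_errors)
# Note (3): comparison against a panel coefficient.
frac_vs_panel = frac_exceeding_2se(coefs, std_errors, reference=0.668)
```

As the notes state, each fraction only approximates the share of stations for which the corresponding t-test has p < 0.05, because the factor of 2 stands in for the exact critical value.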

Table 6. Statistics of weighted quantile regression deviations for regressions using the log(peak) ~ sqrt(ann_max_Q) [(annual maximum of simulated daily discharge)^1/2] model [MMQR, method-of-moments quantile regression; LOOCV, leave-one-out cross-validation].

	Statistic	tau0.04	tau0.1	tau0.25	tau0.5	tau0.75	tau0.9	tau0.96
Single-station	Min.	0	0	0	0	0	0	0
MMQR	Min.	6.89 × 10⁻⁷	2.18 × 10⁻⁶	4.46 × 10⁻⁶	1.93 × 10⁻⁵	5.73 × 10⁻⁶	7.70 × 10⁻⁸	2.18 × 10⁻⁶
MMQR-LOOCV	Min.	2.89 × 10⁻⁶	3.09 × 10⁻⁵	3.76 × 10⁻⁵	3.82 × 10⁻⁵	3.31 × 10⁻⁵	5.33 × 10⁻⁶	1.65 × 10⁻⁶
Single-station	Q1	0.0178	0.0318	0.0504	0.0581	0.0473	0.0303	0.0166
MMQR	Q1	0.0214	0.0373	0.0612	0.0695	0.0563	0.0346	0.0190
MMQR-LOOCV	Q1	0.0360	0.0719	0.1327	0.1599	0.1130	0.0542	0.0241
Single-station	Median	0.0316	0.0616	0.1110	0.1388	0.1080	0.0596	0.0297
MMQR	Median	0.0358	0.0692	0.1243	0.1552	0.1177	0.0637	0.0318
MMQR-LOOCV	Median	0.0619	0.1312	0.2541	0.3295	0.2467	0.1181	0.0506
Single-station	Mean	0.0454	0.0943	0.1694	0.2073	0.1605	0.0861	0.0405
MMQR	Mean	0.0553	0.1065	0.1852	0.2238	0.1717	0.0939	0.0468
MMQR-LOOCV	Mean	0.0990	0.1812	0.3096	0.3931	0.3464	0.2355	0.1493
Single-station	Q3	0.0517	0.1095	0.2108	0.2704	0.2055	0.1047	0.0491
MMQR	Q3	0.0583	0.1208	0.2308	0.2930	0.2210	0.1113	0.0521
MMQR-LOOCV	Q3	0.0908	0.1994	0.4030	0.5450	0.4742	0.2661	0.1093
Single-station	Max.	2.62	3.49	4.97	4.28	2.49	2.16	1.40
MMQR	Max.	4.05	4.72	4.86	4.21	2.65	2.92	2.88
MMQR-LOOCV	Max.	7.47	7.75	7.14	5.26	2.86	2.99	2.93

Note: The smallest (best) weighted quantile regression deviations statistic value is presented in bold type.

Table 7. Statistics of monotonicity fractions for regressions with the log(peak) ~ sqrt(ann_max_Q) [(annual maximum of simulated daily discharge)^1/2] model [MMQR, method-of-moments quantile regression; LOOCV, leave-one-out cross-validation; NA, not applicable].

	By Quantile (tau)											By Observation
	0.002	0.01	0.04	0.1	0.25	0.5	0.75	0.9	0.96	0.99	0.998	Minimum	Mean
Single-station	NA	NA	1	0.979	0.993	0.996	0.996	0.991	0.980	NA	NA	0.286	0.9907
MMQR	1	0.9985	0.9989	0.9990	0.9992	0.9993	0.9993	0.9993	0.9993	0.9992	0.9993	0.0909	0.9992
MMQR-LOOCV	1	0.9982	0.9988	0.9989	0.9992	0.9992	0.9993	0.9993	0.9993	0.9992	0.9993	0.0909	0.9991

Note: The monotonicity fraction is always 1 for the smallest quantile considered because of the method of computation (refer to Section 2.3 for more information) and thus should be ignored.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

