Characterization of Export Regimes in Concentration–Discharge Plots via an Advanced Time-Series Model and Event-Based Sampling Strategies

Currently, the export regime of a catchment is often characterized by the relationship between compound concentration and discharge at the catchment outlet or, more specifically, by the regression slope in log-concentration versus log-discharge plots. However, the scattered points in these plots usually do not follow a plain linear regression representation because of different processes (e.g., hysteresis effects). This work proposes a simple stochastic time-series model for simulating compound concentrations in a river based on river discharge. Our model has an explicit transition parameter that can morph the model between chemostatic and chemodynamic behavior. As opposed to the typically used linear regression approach, our model has an additional parameter to account for hysteresis by including correlation over time. We demonstrate the advantages of our model using a high-frequency data series of nitrate concentrations collected with in situ analyzers in a catchment in Germany. Furthermore, we identify event-based optimal scheduling rules for sampling strategies. Overall, our results show that (i) our model is much more robust for estimating the export regime than the usually used regression approach, and (ii) sampling strategies based on extreme events (including both high and low discharge rates) are key to reducing the prediction uncertainty of the catchment behavior. Thus, the results of this study can help characterize the export regime of a catchment and manage water pollution in rivers at lower monitoring costs. We propose a simple stochastic time-series model to represent the export regime of a catchment beyond simple regression. We propose how to obtain the required data with the least effort when the use of high-frequency in situ analyzers is not feasible or is restricted. Sampling strategies based on extreme events are essential for reducing the prediction uncertainty of the catchment behavior.


Introduction
The way a catchment releases and metabolizes a compound defines the export regime (or behavior) of the catchment regarding this compound. Export regimes can be classified as chemodynamic when concentration varies with discharge (positively or negatively), and as chemostatic when the concentration is not affected by discharge [1][2][3][4][5][6][7]. Chemodynamic export regimes can be classified as dilution behavior and mobilization behavior. During rainfall, a catchment demonstrates dilution behavior if the compound concentration is diluted by high discharge. Conversely, the catchment shows mobilization behavior if the compound concentration increases with discharge. This can occur for compounds mobilized by fast runoff components (e.g., nitrate by interflow, particle-bound pollutants by surface runoff) [7]. On the other hand, a chemostatic export regime presents a washout of pollutants at a relatively constant concentration (the compound concentration varies only slightly with discharge).
One can characterize the export regime of an entire catchment by collecting water samples at the catchment river outlet together with river discharge measurements. However, budgetary restrictions usually limit field campaigns. For this reason, it is crucial to design sampling strategies effectively. Sampling strategies can follow either high- or low-frequency approaches. Low-frequency series (samples by week or month) have commonly been used to characterize the concentration-discharge (C-Q) relationship of a catchment [4,[7][8][9]. For example, Godsey et al. [4] used hydrochemical data of 59 sites, with more than 30 years of data, but with only five to seven water samples a year. Musolff et al. [7] used 16-year time series of monthly and bi-monthly samples at gauging stations to predict export behavior. In comparison to low-frequency sampling, high-frequency sampling (e.g., hourly samples) decreases the uncertainty of export estimates [10] and water quality parameters [11]. Some examples of studies that use high-frequency sampling are Bowes et al. [12], Evans and Davies [13], Floury et al. [14], Grimaldi et al. [15], Jones et al. [16], Liu et al. [17], Ockenden et al. [18], Rusjan et al. [19], and Schwientek et al. [20]. Nevertheless, both low- and high-frequency sampling approaches can provide insightful information on the catchment, e.g., [3,21].
Export regimes of a catchment are commonly described by a simple linear regression slope of log-concentrations (log 10 C) versus log-discharges (log 10 Q) measured in the river at the catchment outlet. The slope value of this power-law relationship defines the export regime of the catchment. A negative slope corresponds to dilution-type catchments, and a slope of zero corresponds to the chemostatic type [4], whereas the mobilization type shows a positive relationship between log 10 Q and log 10 C [3][4][5][6][7]. Therefore, linear regression can provide a beneficial water quality metric via the slope of log 10 C-log 10 Q.
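As a minimal illustration of this conventional approach (synthetic data and hypothetical values, not the study's actual analysis), the slope can be obtained by ordinary least squares on the log-transformed pairs:

```python
import numpy as np

def cq_slope(Q, C):
    """OLS slope of log10(C) versus log10(Q).

    Negative slope -> dilution, ~0 -> chemostatic, positive -> mobilization.
    """
    slope, intercept = np.polyfit(np.log10(Q), np.log10(C), 1)
    return slope

# Synthetic dilution-type data: C = k * Q**(-0.5) with constant k = 20
rng = np.random.default_rng(0)
Q = rng.lognormal(mean=0.0, sigma=0.5, size=500)
C = 20.0 * Q**(-0.5)
print(round(cq_slope(Q, C), 3))  # -0.5 (exact power law, no scatter)
```

With real C-Q data, the scatter around this line is what the rest of the paper addresses.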
However, the catchment has memory of its past due to the involved time scales of runoff formation, contaminant release, and reactive catchment-scale transport [13]. Many studies look at the effects of hysteresis on compound exports [22][23][24][25][26]. Hysteresis means that concentrations depend not only on discharge but also show an apparent dependency on past concentration values. Several studies capture hysteresis metrics to explain hysteresis loops produced during storms [27][28][29][30]. Thus, many catchments show temporal hysteresis that defies linear regression assumptions [31]. These hysteresis effects profoundly contradict the "residuals are independent errors" assumption that forms the basis of classical least squares regression. In simpler words, the remaining scatter of data points around the regression line should at least be uncorrelated with zero mean and a constant variance. Yet, temporal hysteresis means that the residuals are autocorrelated in time. With these assumptions violated, regression results can be suboptimal and biased, and can foster incorrect conclusions.
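A minimal synthetic sketch (hypothetical parameter values) of why this matters: when the scatter around the log 10 C-log 10 Q line carries AR(1)-type memory, the fitted residuals remain strongly autocorrelated, violating the independence assumption:

```python
import numpy as np

def lag1_autocorr(x):
    """Lag-1 autocorrelation estimate of a series."""
    x = np.asarray(x, float) - np.mean(x)
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

rng = np.random.default_rng(1)
n = 2000
# AR(1) "memory" term standing in for hysteresis in the release
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.95 * y[t-1] + rng.normal(0.0, 0.1)
logQ = rng.normal(0.0, 0.4, n)
logC = -0.2 * logQ + y            # slope -0.2 plus correlated scatter
slope, icpt = np.polyfit(logQ, logC, 1)
resid = logC - (slope * logQ + icpt)
print(lag1_autocorr(resid) > 0.8)  # residuals are strongly autocorrelated
```

Classical OLS standard errors and slope estimates are unreliable under such residual correlation.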
To overcome the limitations of linear regression, different alternatives have been developed. For example, some methods are able to model C-Q relationships that vary with time, discharge, and season. One of these approaches is the WRTDS model [32], which is extensively used by the United States Geological Survey (USGS). The WRTDS model can analyze long-term surface-water-quality datasets by using weighted regression of concentrations on time, discharge, and season. Thus, WRTDS aims at interpolating low-frequency concentration time series (often monthly) at the scale of discharge time series (often daily), but not the characterization of export regimes. Also, the sample collection period must be at least 20 years, and a minimum of 200 samples is needed. These conditions are seldom met in most water-quality time series.
A modified and improved version of WRTDS, which gets better predictions of riverine concentration and flux, was developed by Zhang et al. [33] and Zhang & Ball [34]. Another more recent alternative methodology is the two-sided affine power scaling relationship by Tunqui et al. [35], which produced better results than the linear regression.
Underwood et al. [36] applied Bayesian inference to estimate segmented regression model parameters and identify the export regime. Qian et al. [37] showed that a Bayesian approach could improve the predictions of nutrient loads.
Other recent studies have taken advantage of high-frequency time series to study the C-Q relationship. For example, Bieroza & Heathwaite [22] studied the variability of the phosphorous C-Q relationship by applying fuzzy logic models, and looked into the hysteresis of the phosphorous export.
To provide information about the processes regulating the export, many recent studies have shown that sampling strategies based on events or seasonality are deeply advantageous [5,6,22,30,34,36,[38][39][40][41][42][43]. These event studies were focused on discharge time series (e.g., recession time series) such as in Jachens et al. [39], as well as combined with solute time series (e.g., C-Q studies) such as in Knapp et al. [5]. For example, Minaudo et al. [40] introduced a new model to account for different temporal scales that correspond to fast and slow events (storms and seasonal, respectively). Their results showed that apparent C-Q relationships could not only be different, but even opposite depending on the scale considered. Other recent studies used autoregressive time-series models to fit high-frequency data to decipher the dynamics of compound exports [16,18].
Nevertheless, hysteresis effects can only be detected when high-frequency observations are taken, since they are overlooked with low-frequency strategies [6,13,44]. Additionally, high-frequency measurements can be resource intensive (labor and economically) and challenging to apply at many field sites, especially during high-flood events [5].
Therefore, in this manuscript, we propose a new method to model C-Q relationships beyond simple regression. Additionally, we determine which observation strategies produce better predictions with the least effort (i.e., with fewer samples) when the use of high-frequency in situ analyzers is not available or is restricted. For this purpose, we introduce a simple stochastic time-series model (regime-and-memory model, RMM) for concentrations in the river that accounts for fluctuating release and transport with memory, using an autocorrelation over time. One explicit parameter of our model represents the export regime. This parameter can morph the model between chemostatic-type and chemodynamic-type catchment behaviors, and it resembles the regression slope in plots of log 10 C versus log 10 Q. To search for the best sampling strategies, we applied retrospective optimal design of experiments (ODE), as in [45]. In our specific case, retrospective means thinning out an exhaustive time series from high-frequency in situ analyzers to only a few samples.
Overall, the contributions of our work are fourfold: (i) We introduce a simple stochastic time-series model to characterize the export regime of a catchment subject to hysteresis; (ii) We demonstrate the robustness of our model even with only small data sets; (iii) We explore how many C-Q samples can be enough to characterize the export regime of a catchment sufficiently well; and (iv) We recommend sampling strategies that optimize the characterization of the export regime with the least sampling effort. For illustration purposes, we use a high-frequency data series of collected nitrate concentrations. They were collected with an in situ analyzer in the Ammer catchment in southwestern Germany. The available data [20] span a total period of nine months, with hourly observations. They include nitrate concentrations and discharge rates with over 6500 measurements in total.
Accordingly, this paper is organized as follows: In Section 2, we first explain our stochastic time-series model. Then, we explain how to infer the parameters of our model from the whole time series and how we thin out data to reduce them towards different sampling strategies. In Section 3, we introduce the observed data and catchment used for this study. In Sections 4 and 5, we present and discuss, respectively, our results. Finally, we disclose our conclusions in Section 6.

Methodology
Throughout this study, we will consider −log 10 C versus log 10 Q in the following, instead of log 10 C, for mathematical convenience in model building. The slope in corresponding plots is α in Equation (1), the parameter that describes the export regime in our model. For this study, we disregard a negative relation α (mobilization behavior) between log 10 Q and −log 10 C, since the Ammer catchment lacks this component, as seen in Schwientek et al. [20] and Section 3. However, both the model and methodology can cover the full range of α (chemostatic, dilution, and mobilization behavior).

Regime and Memory Model (RMM)
To represent the evolution of concentration versus time, we use a simple stochastic time-series model for nitrate in the river. Our model has an explicit morphing parameter α. This parameter allows a transition of the model between chemostatic-type and chemodynamic-type behavior of the headwater catchment:

C t = k t Q t −α, (1)

where C is the concentration (∈ R+, M/L 3 ). In Equation (1) and throughout this manuscript, we refer to γ (= −α) as the export regime parameter, Q is the discharge rate (∈ R+, L 3 /T), and k is the apparent release rate (∈ R+, with units depending on α). The subscript t indicates that parameters/variables depend on the current time t.
When α ≈ 0 (γ ≈ 0), the catchment shows a chemostatic export regime, i.e., concentration depends only on the apparent release rate, which then has units of a (fixed) concentration M/L 3 . When α is positive (γ < 0), the catchment shows a dilution behavior, presenting an anti-proportional dependence on the discharge rate. In the limit case of α = 1 (γ = −1), k assumes units of a mass flow rate [M/T].
To ensure that k is never negative, we assume that k follows a lognormal distribution with mean m k and variance v k . Furthermore, we assume that k t follows (over time) an auto-regressive model of order one (AR(1), [46]) in order to account for the dominant characteristic time scale of release and chemical turnover in the catchment. Thus, the AR(1) model is commensurate with the source strength and includes effects such as hysteresis or event-to-event variations of source availability.
Setting log(k) = Y, we can define an AR(1) for Y as:

Y t = λ Y t−1 + c + ε t, (2)

where λ ∈ [0, 1) is a correlation parameter to account for the correlation between concentrations at the current time and previous times; c is a constant that is responsible for the mean; and ε is a white-noise time series, following a normal distribution with mean equal to 0 and a standard deviation of σ ε, N(0, σ ε). The AR(1) is a stochastic model with mean and variance given by:

m Y = c / (1 − λ), (3)

v Y = σ ε 2 / (1 − λ 2 ). (4)

As seen in Mehne and Nowak [47], the characterization of an AR(1) in calibration/inference is made more convenient and intuitive by defining a characteristic time t ch and the mean and variance of k, respectively m k and v k. The characteristic time is defined as

t ch = 1 / (1 − λ), (5)

and the relationships between m Y, v Y and m k, v k are the well-known moment relations for the lognormal distribution [48]:

m Y = log(m k) − v Y / 2, (6)

v Y = log(1 + v k / m k 2 ). (7)

Equations (6) and (7) allow the adjustment of the log-AR(1) into more understandable units of k, i.e., [M/L 3 ] or [M/T], rather than for Y = log(k). Therefore, with only four parameters (α, m k, v k, and t ch ), we can represent concentrations versus time.
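A minimal simulator consistent with Equations (1)-(7) can be sketched as follows (unit time steps and hypothetical parameter values are assumed; this is an illustration, not the study's code):

```python
import numpy as np

def simulate_rmm(Q, alpha, t_ch, m_k, v_k, rng):
    """Generate one random concentration series C_t = k_t * Q_t**(-alpha),
    where Y = log(k) follows a stationary AR(1).

    Assumes lambda = 1 - 1/t_ch (t_ch measured in time steps) and the
    lognormal moment relations between (m_k, v_k) and (m_Y, v_Y).
    """
    lam = 1.0 - 1.0 / t_ch
    v_Y = np.log(1.0 + v_k / m_k**2)          # Eq. (7)
    m_Y = np.log(m_k) - 0.5 * v_Y             # Eq. (6)
    c = m_Y * (1.0 - lam)                     # Eq. (3) rearranged
    sig_eps = np.sqrt(v_Y * (1.0 - lam**2))   # Eq. (4) rearranged
    Y = np.empty(len(Q))
    Y[0] = rng.normal(m_Y, np.sqrt(v_Y))      # start in the stationary state
    for t in range(1, len(Q)):
        Y[t] = lam * Y[t-1] + c + rng.normal(0.0, sig_eps)  # Eq. (2)
    return np.exp(Y) * np.asarray(Q)**(-alpha)              # Eq. (1)

# Chemostatic example (alpha = 0): C fluctuates around m_k with memory
rng = np.random.default_rng(2)
Q = rng.lognormal(0.0, 0.3, 5000)
C = simulate_rmm(Q, alpha=0.0, t_ch=100.0, m_k=30.0, v_k=8.0, rng=rng)
print(round(float(np.mean(C)), 1))  # close to m_k = 30
```

Setting alpha closer to 1 would morph the same series toward dilution behavior.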

Bayesian Parameter Inference of Full Time-Series Data
Equations (1)-(7) define a stochastic time-series model for concentration, i.e., a model that can generate many random time series. Its relevant properties can be controlled by the choice of values for α, t ch , m k , v k , and by using a given time series for Q t . Thus, the next step is to fit the model to the observed data of concentration C t . Due to the stochastic randomness implied for C t by k t (explicit by the white noise ε t in Equation (2)), this is achieved by choosing parameter values such that the resulting model is most likely to generate good-fitting random C t time series. This parameter choice is subject to a list of plausibility arguments for admissible parameter values.
The framework most suitable for this task is Bayesian parameter inference, as in [49]. At the heart of Bayesian inference lies Bayes' theorem:

p(θ|y) = p(y|θ) p(θ) / p(y), (8)

where θ denotes the parameters to be inferred (α, t ch, m k, v k ) and y denotes the measurement data (Q, C). The prior p(θ) represents our belief (or knowledge) about the parameters before seeing any data and has to be chosen by us (Table 1). The first term in the numerator, p(y|θ), is the likelihood of observing the data for the given parameters. The term in the denominator, p(y), is independent of the parameters θ, and therefore does not need to be computed. The result is the posterior distribution p(θ|y), which expresses our updated (calibrated) belief about the parameters after seeing the data y. Following this, we choose a prior distribution for our parameters and derive an expression for the likelihood. After that, we explain how to obtain a sample from the posterior distribution via Markov Chain Monte Carlo (MCMC) sampling.

Prior Distributions
Bayesian parameter inference treats unknown parameter values as random variables and formulates plausibility arguments as prior probability distribution functions (PDF) ( Table 1). Then, it updates these distributions to posterior distributions based on the available observation data.
We assumed that we were clueless about the export regime α and we only discarded mobilization behavior as justified in Section 3. Thus, we adopted a uniform distribution between the two extreme values of chemostatic and pure dilution behavior (α ∈ [0,1]).
For the characteristic time t ch , we assumed that concentrations in the river present roughly weekly cycles (e.g., due to weekly cycles in the release of wastewater treatment plants, which are in turn triggered by cycles in known activities) with a mean equal to four days (m tch = 95.7 h) and a standard deviation of three days (s tch = 71.7 h). Thus, the 95% confidence interval was about [1 d, 14 d]. Because t ch must be positive, we chose once again a lognormal distribution. The corresponding parameters of the lognormal distribution LN(µ, σ) followed from respective copies of Equations (6) and (7), resulting in µ tch = 4.35 and σ tch = 0.65.
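This moment conversion can be checked with a few lines using the standard lognormal moment relations (the small deviations from the rounded values above are rounding effects):

```python
import math

def lognormal_params(mean, sd):
    """Convert the mean/sd of a lognormal variable into the (mu, sigma)
    of the underlying normal, via the moment relations of Eqs. (6)-(7)."""
    var_log = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - 0.5 * var_log
    return mu, math.sqrt(var_log)

# Prior for t_ch: mean 4 d (95.7 h), standard deviation 3 d (71.7 h)
mu, sigma = lognormal_params(95.7, 71.7)
print(round(mu, 2), round(sigma, 2))  # 4.34 0.67, close to the reported 4.35, 0.65
```

The same helper applies to any lognormally distributed quantity in the model, e.g., k.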
We also assumed little prior information for the mean m k and variance v k of the apparent release rate k, so we chose uniform distributions. We will discuss the mean m k first. For α = 0, it corresponds to the long-term mean of nitrate concentrations. For an average discharge rate of 0.87 m 3 /s (Section 3), if we round this to ≈1 m 3 /s, then concentration C = k (Equation (1)) for any value of α. Therefore, we can interchange C and k in the following discussion.
We assumed that the expected values of k (e.g., m k ) within the river were between pristine water (≈1 mg/L) and the effective average of the undiluted, agriculturally shaped groundwater in the Ammer catchment (≈40 mg/L, as observed as integral catchment output during baseflow by Schwientek et al. [20]). Finally, we will discuss the standard deviation √ v k of k. For α = 0, it controls the amplitude of fluctuations in nitrate concentrations. We assumed a uniform distribution between 1 and 6.32, because we could expect both an almost constant k and a dynamic k.

Likelihood
To compute the likelihood, we used the following semi-analytical method. It calculates so-called depreciated increments between Y t and Y t−l , where l is the lag distance along time between two observations. For example, the data we used comprised 6604 hourly samples. Between the third and the first samples, the lag time was 2. Therefore, if the whole time series is taken into account, then the lag time between two observations is l = 1. Thinning out to half the sampling rate corresponds to l = 2. Irregular sampling triggers individual lag values between consecutive data values.
We assumed that θ = (α, t ch , m k , v k ) were the currently proposed trial values. We then wanted to compute the likelihood of the data from the time series y = (Q, C) for the trial values θ. Note that y can include the whole high-frequency series of data or only a short list of observations at various lag times.
Using Equation (1), we could transform these two time series of Q and C into a time series of k t . Then, by applying the log, we obtained a time series of Y t . Next, we calculated λ = 1 − 1/t ch (Equation (5)), and then c and σ ε from m Y and v Y . We did so by first solving Equations (6) and (7), and then Equations (3) and (4). Before computing the depreciated increment between two observations, we wanted to obtain a zero-mean AR(1) series Y 0 ; therefore, we removed the mean m Y (Equation (3)) from Y:

Y 0 t = Y t − m Y . (9)

Then, the depreciated increments X l can be computed:

X l = Y 0 t − λ l Y 0 t−l . (10)

Equation (10) shows that the X l are mutually independent, as they are sums of nonoverlapping segments from the white-noise series ε t . The variance and corresponding standard deviation of the depreciated increment are, respectively:

v X l = σ ε 2 (1 − λ 2l ) / (1 − λ 2 ), (11)

σ X l = sqrt(v X l ). (12)

After these rearrangements, the likelihood p(y|θ) is the probability density value of the X l , which in turn is given by the product of normal densities:

p(y|θ) = ∏ i=2..n t p(X l,i |θ), (13)

where n t is the total number of samples of the considered series (here n t = 6604 when using l = 1). In Equation (13), we can use the simple product rule due to independence within the time series of X l . Finally, the likelihood term p(X l |θ) in Equation (13) is given by

p(X l |θ) = N(0, σ X l 2 ), (14)

which is the normal distribution for X l implied by the chosen priors and the above equations.
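A minimal sketch of this likelihood computation follows (synthetic data, hypothetical parameter values, unit time steps, and λ = 1 − 1/t ch assumed):

```python
import numpy as np

def log_likelihood(Q, C, times, alpha, t_ch, m_k, v_k):
    """Log-likelihood of (possibly irregularly sampled) (Q, C) data,
    built from independent depreciated increments X_l (Eqs. (9)-(14)).
    times are integer step indices; the lags l follow from their spacing."""
    lam = 1.0 - 1.0 / t_ch
    v_Y = np.log(1.0 + v_k / m_k**2)                 # Eq. (7)
    m_Y = np.log(m_k) - 0.5 * v_Y                    # Eq. (6)
    sig_eps2 = v_Y * (1.0 - lam**2)                  # Eq. (4) rearranged
    Y0 = np.log(C * np.asarray(Q)**alpha) - m_Y      # Eqs. (1) and (9)
    lags = np.diff(times)
    X = Y0[1:] - lam**lags * Y0[:-1]                 # Eq. (10)
    var_X = sig_eps2 * (1.0 - lam**(2 * lags)) / (1.0 - lam**2)  # Eq. (11)
    return float(np.sum(-0.5 * np.log(2 * np.pi * var_X) - X**2 / (2 * var_X)))

# Synthetic check: data generated with alpha = 0.5 should be more likely
# under the true alpha than under alpha = 0.
rng = np.random.default_rng(3)
n, t_ch0, m_k0, v_k0 = 3000, 50.0, 30.0, 8.0
lam = 1.0 - 1.0 / t_ch0
v_Y = np.log(1.0 + v_k0 / m_k0**2)
m_Y = np.log(m_k0) - 0.5 * v_Y
sig = np.sqrt(v_Y * (1.0 - lam**2))
Y = np.empty(n); Y[0] = m_Y
for t in range(1, n):
    Y[t] = lam * Y[t-1] + m_Y * (1.0 - lam) + rng.normal(0.0, sig)
Q = rng.lognormal(0.0, 0.5, n)
C = np.exp(Y) * Q**(-0.5)
t_idx = np.arange(n)
print(log_likelihood(Q, C, t_idx, 0.5, t_ch0, m_k0, v_k0) >
      log_likelihood(Q, C, t_idx, 0.0, t_ch0, m_k0, v_k0))  # True
```

Because the X l are independent, thinned-out data (larger or irregular lags) enter through exactly the same expression.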

Sampling from the Posterior
With the prior and likelihood in place, we could then sample from the posterior distribution via MCMC sampling. Here, we explicitly coded MCMC in Matlab [50]. The scaling factor for the MCMC proposal was constructed as an adjustable fraction of the posterior standard deviation to get an acceptance rate of approximately 27% (the optimal acceptance ratio for four dimensions according to Gelman et al. [51]). Since the posterior standard deviation is only known after the MCMC is run at least once, we used a burn-in run of the MCMC to obtain good scaling factors for the productive final part of the MCMC.
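The sampler itself is a standard random-walk Metropolis scheme; a generic sketch (tested here on a toy one-dimensional target rather than the RMM posterior; the step size stands in for the tuned scaling factor) is:

```python
import numpy as np

def metropolis(log_post, theta0, step, n_iter, rng):
    """Random-walk Metropolis sampler.

    step sets the proposal scale; in the paper it is tuned during burn-in
    to reach an acceptance rate near 27% (optimal for four dimensions)."""
    theta = np.asarray(theta0, float)
    lp = log_post(theta)
    chain = np.empty((n_iter, theta.size))
    accepted = 0
    for i in range(n_iter):
        prop = theta + step * rng.normal(size=theta.size)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis acceptance
            theta, lp = prop, lp_prop
            accepted += 1
        chain[i] = theta
    return chain, accepted / n_iter

# Toy check on a 1-D standard normal target
rng = np.random.default_rng(4)
chain, acc = metropolis(lambda th: -0.5 * th @ th, np.zeros(1), 2.4, 20000, rng)
print(abs(np.mean(chain)) < 0.1, abs(np.std(chain) - 1.0) < 0.1)
```

For the RMM, log_post would be the sum of the log-prior and the depreciated-increment log-likelihood, and theta would collect (α, t ch, m k, v k).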
The posterior marginal PDF of α, p(α|y d ), is again given by Bayes' theorem [49]:

p(α|y d ) ∝ ∫ p(y d |α, θ −α ) p(α, θ −α ) dθ −α , (15)

where p(y d |α, θ −α ) is the likelihood of the observed data given α and the remaining parameters θ −α , and p(α, θ −α ) is the prior of the parameters. For simpler discussion, we reduced the sample to a point estimate by computing the posterior mean from the posterior samples:

α post,d = (1/N) ∑ i=1..N α i , (16)

where N is the length of the MCMC. We used N = 10,000 and N = 100,000 for the burn-in and the productive part of the MCMC, respectively. Finally, we could compute an inferred truth value α full from the full data set, and values α post,d for various sampling designs d.

Retrospective Optimal Design of Experiments (ODE)
The objective of our optimal design was to minimize the error ∅ in estimating the parameter α:

d opt = arg min d∈D ∅ α (d), (17)

where D is a set of considered design strategies, d is any considered design, d opt is the best design among D, and ∅ α (d) is the squared error in estimating α for design d (as α post,d ) when compared to α full :

∅ α (d) = (α post,d − α full ) 2 . (18)

In summary, when we thinned out the time-series data, only including data as specified by design d, we inferred α and hoped that this α post,d would remain close to α full (α inferred with full time-series data). We also assessed whether α post,d had little remaining uncertainty, as expressed by intervals of its distribution p(α|y d ) according to Equation (15).
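The design comparison in Equations (17) and (18) reduces to an argmin over squared errors; a minimal sketch with hypothetical posterior means for three designs:

```python
def best_design(alpha_full, alpha_post):
    """Pick the design minimizing the squared error (Eqs. (17)-(18)).

    alpha_post maps design id -> posterior mean alpha_post,d."""
    errors = {d: (a - alpha_full) ** 2 for d, a in alpha_post.items()}
    return min(errors, key=errors.get), errors

# Hypothetical posterior means compared against alpha_full = 0.011
d_opt, err = best_design(0.011, {1: 0.30, 9: 0.08, 16: 0.02})
print(d_opt)  # 16
```

The same loop extends directly to all 20 designs in Table 2 once their α post,d values are inferred.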
We propose 20 designs based on four sampling strategies (Table 2). Firstly, we thinned out the observed data based on time-frequency strategies (d = 1 to 4). Secondly, we thinned out based on discharged volume within the river (d = 5 to 8). In both strategies, sampling was uniform (in time or discharged water), and we varied the number of samples between 5 and 40. Thirdly, we applied event-based sampling strategies, in which we considered samples at low discharge rates (d = 9 to 12) and high ones (d = 13 to 16). The last strategy considered low and high discharge rates together (d = 17 to 20). Here, we also considered between 5 and 40 samples. Designs should not be understood as random selections but as manual selections, as described in Appendix B. In this same appendix, the reader can find the observation values and the exact time coordinates for each sampling strategy.

In the low-Q strategies, samples are taken at low discharge rates (Q < 0.67 m 3 /s) that belong to different peak events (uncorrelated samples); we choose them equidistantly. In the high-Q strategies, samples are taken at high discharge rates (Q > 2.5 m 3 /s) for one peak event (correlated samples); when no more observations are available for one peak, we proceed to grab samples from a second peak event, and so on. For this study, we considered a low discharge rate to be lower than 0.67 m 3 /s (which corresponds to a value 25% below the mean discharge of the whole time series) and a high discharge rate to be Q > 2.5 m 3 /s.

Catchment and Data
The data used in this study were already published in 2013 [20]. The data were observed in the gauged catchment of the Ammer River located in the state of Baden-Württemberg in southwest Germany. The area of the catchment upstream of the gauging station Pfäffingen is 134 km 2 . The geology is dominated by limestone of the Middle Triassic ("Oberer Muschelkalk") and gypsum-bearing mudstone of the Upper Triassic ("Gipskeuper"). Both formations, particularly the limestone, are strongly karstified, and the Ammer River is primarily fed by karst springs, most of them being situated close to the main stem river [52]. In line with the large storage capacity of the karst system, the permeable rocks of the catchment lead to a dampened hydrologic variability with a very strong and steady baseflow. Mean annual low flow (0.44 m 3 /s) is as high as 50% of the long-term average flow (0.87 m 3 /s) (retrievable from https://www.hvz.baden-wuerttemberg.de/, accessed on 14 May 2020). Nevertheless, pronounced and sharp flood peaks typically occur during the summer season. They are caused by surface runoff generation on impervious urban areas in the upper catchment [20,53]. Usually, due to steep recessions, baseflow is attained within a few hours after the flood peaks. In total, 17% of the catchment area is urbanized and 71% is occupied by agriculture, 66% of which is used for arable land [20].
In terms of hydrochemistry, the strong and steady contribution from the limestone karst system leads to stable concentrations of geogenic and agriculture-derived solutes such as nitrate and chloride. A short-term dilution of these compounds may occur during storm events. Then, large amounts of event water with low solute content are introduced into the river [20]. Discharge data and chemical data were collected hourly with in situ analyzers from 07/1/2011 to 03/31/2012. This data set of observations includes over 6500 measurements of nitrate concentrations, electric conductivity, and discharge rates.
Here, we focus on discharge rate, Q, and nitrate concentration, C, (Figure 1a) although our proposed procedure could likewise be applied to other compounds. Over the time series, the average Q (0.93 m 3 /s) was similar to the long-term average (0.87 m 3 /s). The maximum Q (27.5 m 3 /s) represents an exceptional flood with a return period of about 20 years. The average C was 31.5 mg/L, but a range between 9.5 and 44.7 mg/L was captured.
In summary, the system is dominated by baseflow, which explains the clear tendency towards a chemostatic behavior. Figure 1b shows the classic approach to determine the export regime of a catchment by computing a regression slope of log 10 C versus log 10 Q. The effect of the short-term dilutions by event water inputs results in a slightly negative relation between Q and C, expressed by a small negative regression slope (−0.1804 in Figure 1b). This means that the Ammer River shows a rather chemostatic export regime for nitrate. The red circle in Figure 1b includes observations taken sequentially over time that belong to a hysteretic event. It depicts hysteresis in a highly visible fashion. Other hysteresis loops exist as well, e.g., the rightmost group of data points. Many more such loops are hidden in the bulk of the point cloud. With such unconventional residual statistics (i.e., loop patterns in the scatter around the regression line), the applied least-squares linear regression is not robust. Especially for smaller time series, almost any regression slope between 0 and −1 (and even outside these bounds) would be possible.

To show that these data do not comply with the regression assumptions (e.g., the mean is not constantly zero and the variance is not constant), we computed the residuals in the C-Q regression of the Ammer catchment. In Appendix A, we present the plots of the residuals, moving window average, and moving window variance (Figure A1), as well as the correlation of Residual(t) with Residual(t−1) (Figure A2).

RMM Simulations
Figure 2 displays some examples generated with our RMM for a fixed hyper-parameter (α or t ch ), while the other three parameters are inferred by Bayesian inference using the full time-series data. Table 3 includes the inferred hyper-parameters used to generate the concentration time series shown in Figure 2. We would like to mention that the CPU time of applying our RMM model together with MCMC (for an ensemble size of 100,000) is less than three minutes on a standard desktop PC (Intel® Core™ i7-7700 CPU @3.60 GHz, 32 GB of RAM), which makes it a very affordable and efficient model.

Table 3. Inferred hyper-parameter values when α or t ch are fixed, using the full time-series data.

Characterization with Extensive Time-Series Data
In Table 4, we show the inferred mean posterior values of α, t ch , m k , and v k based on the full given discharge data Q t and measured concentration data C t .

Table 4. Inferred values for the Ammer catchment using the whole time-series data (a total of 6604 measurements).

The inferred value α full is equal to 0.011 and corresponds to a chemostatic behavior (γ ≈ 0). Therefore, concentration mainly depends on the apparent release rate k (Equation (1)). This agrees with the hydrochemical arguments by Schwientek et al. [20] discussed in Section 3. The inferred t ch,full is equal to 106.4 h, which corresponds to 4.4 days. This value is similar to the assumed prior mean of four days in Table 1. Therefore, the Ammer River indeed presents roughly weekly cycles of nitrate concentration release into the river.
The average expected nitrate concentration is m_k,full = 30.8 mg/L, which lies between that of pristine water (1 mg/L) and the integral catchment output during baseflow (Section 2.2.1). Also, m_k,full is very similar to the average C (31.5 mg/L) obtained by Schwientek et al. [20]. The variance of k is v_k,full = 7.83, which indicates that the system has relatively low dynamics in the (log-)release. Figure 3 shows that the interpolation (black line) produced by the RMM, when applying the inferred values in Table 4, passes fully through the data we used for calibration (red points). This figure also displays the credible intervals (CI) (2.5% and 97.5%, respectively) for daily observations (time = 0-90 days), observations every 4 days (time = 90-180 days), every 8 days (time = 180-240 days), and no sampling (time > 240 days). The 95% range is narrower when more observations are considered, decreasing the uncertainty of the RMM interpolation. Thus, the RMM can not only calibrate the hyper-parameters of the actual time series, but can also interpolate with quantified standard errors (grey lines). Figure 3 also shows some concentration predictions (colored lines) generated by the RMM after the calibration.

Figure 3. Interpolation produced by the RMM using the inferred values in Table 4 and assimilated data (red points). Daily observed data are shown with white points. Data are assimilated daily for the first 90 days, every 4 days from 90 to 180 days, and every 8 days from 180 to 240 days. Predictions (colored lines) of the RMM are shown for times larger than 240 days. The upper and lower CI are also shown in panel (a) (grey lines).

Figure 4 shows the prior and posterior distributions of α for the sampling strategies described in Table 2. The minimum and maximum of the box plots mark the 95% CI (2.5% and 97.5% values, respectively). The red star marks the posterior mean of α_full, whereas the black stars mark the posterior mean values inferred by the designs. Only design d = 16 can reproduce the posterior α with very high confidence.
Yet, the other event-based designs involving samples at high Q (High Q events, d = 13 to 16, and Low and High Q events, d = 17 to 20) are surprisingly confident, although they use only very few samples. All these strategies present a 95% CI range lower than 0.200 (Table 5). Table 5 summarizes the results of the 20 sampling strategies for the RMM, including the posterior α_post,d (Equation (16)), the width of the 95% range, and the values of the 2.5% and 97.5% CI bounds.
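The posterior means and 2.5%/97.5% bounds reported in Table 5 can be extracted from any likelihood-weighted sample cloud with a weighted-quantile computation. The following is a generic sketch with a synthetic ensemble and synthetic weights, not the paper's actual MCMC output:

```python
import numpy as np

def weighted_quantile(samples, weights, q):
    """Quantile q of a weighted sample cloud (e.g., a likelihood-weighted
    Monte Carlo ensemble of alpha values)."""
    order = np.argsort(samples)
    s, w = np.asarray(samples)[order], np.asarray(weights)[order]
    cdf = np.cumsum(w) / np.sum(w)      # empirical weighted CDF
    return float(np.interp(q, cdf, s))

# Illustrative ensemble: uniform prior draws of alpha in [0, 1] with
# synthetic likelihood weights concentrating mass near alpha = 0.1.
rng = np.random.default_rng(0)
alpha = rng.uniform(0.0, 1.0, 100_000)
weights = np.exp(-0.5 * ((alpha - 0.1) / 0.05) ** 2)

post_mean = np.average(alpha, weights=weights)      # posterior mean of alpha
ci_low = weighted_quantile(alpha, weights, 0.025)   # lower 95% CI bound
ci_high = weighted_quantile(alpha, weights, 0.975)  # upper 95% CI bound
print(post_mean, ci_low, ci_high)
```

The same three numbers (posterior mean, 2.5% bound, 97.5% bound) are what each row of a table like Table 5 would report per design.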

Characterization with Sparse Data (Sampling Strategies)
Generally, increasing the number of samples from 5 to 40 decreases the bias of the posterior mean α_post,d. It also increases the reliability of the designs: the range of the CI decreases. For instance, the time-frequency strategy produces a quick reduction of the α bias when the number of samples increases: α_post,d decreases from 0.351 (d = 1, with five samples) to 0.113 (d = 4, with 40 samples). In addition, the width of the 95% range for d = 4 is reduced from the prior value of 0.95 (0.975 − 0.025) to 0.28 (0.29 − 0.01). However, strategies based on river-discharge frequency and on Low Q events do not follow this tendency; instead, their biases of α_post,d seem to increase when increasing the number of samples from 5 to 20 (e.g., d = 5 versus d = 7). Nevertheless, these two types of strategies seem to benefit when the number of samples is increased to 40 (see Section 5.3). Therefore, further improvement is expected when more samples are taken for these two types of strategies.

As already mentioned, when we apply sampling strategies based on events (Figure 4), we can observe two different performances. There are sampling strategies that decrease the bias and uncertainty of α_post,d significantly, such as the strategies based on High Q events. Strategies that show the lowest reduction of uncertainty (95% CI range > 0.50) include both the frequency-based strategies with a low number of samples (5 and 10; d = 1, 2, 5, and 6) and the strategies based on measuring Low Q events (d = 9 to 12). Obviously, strategies with only 5 or 10 samples involve less information; therefore, only a slight improvement of the posterior is expected. However, as observed in Figure 4 and Table 5, strategies that collect data during High Q events produce a significant reduction of the bias and uncertainty of α_post even with a low number of samples.
In summary, of the five types of strategies (Figure 4), the High Q events strategy stands out as the best because it not only decreases the posterior bias, but also reduces the uncertainty width. Another good option that improves the prediction of α_post is the Low and High Q events strategy, although it presents a higher bias and uncertainty. Therefore, to maximize the variance reduction of the export strength and minimize the data-collection cost, we recommend High Q events strategies.
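An event-based design of the High Q type can be emulated by greedily picking the largest discharge values subject to a minimum spacing between samples, so that the picks land on distinct events rather than on near-duplicate points of a single peak. This is an illustrative heuristic, not necessarily the paper's exact design procedure:

```python
import numpy as np

def high_q_event_samples(Q, n_samples, min_gap=24):
    """Pick sampling times around the largest discharge values.

    min_gap (in time steps, e.g. hours) keeps selected samples on distinct
    events, so the few available samples are spread across several peaks.
    Illustrative sketch of a High-Q-events design, not the paper's algorithm.
    """
    order = np.argsort(Q)[::-1]          # candidate times, highest Q first
    chosen = []
    for t in order:
        if all(abs(int(t) - c) >= min_gap for c in chosen):
            chosen.append(int(t))        # greedy pick respecting the spacing
        if len(chosen) == n_samples:
            break
    return sorted(chosen)

# Synthetic 'hourly discharge' series standing in for the Ammer record.
rng = np.random.default_rng(0)
Q = np.abs(np.cumsum(rng.normal(0, 1, 6604)))
times = high_q_event_samples(Q, n_samples=5)
print(times)
```

A Low-and-High variant would simply run the same selection a second time on −Q and merge the two sets of times.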

Comparison of the RMM to the Linear Regression Approach
Both methods, namely the RMM and linear regression, agree that the Ammer catchment shows a chemostatic regime, with α_full = 0.011 and α_reg = 0.180, respectively, when all 6604 samples are used for the analysis. However, with much fewer data used, α_post,d inferred with our method is more robust than α_reg,d obtained by simple regression. Table 5 also includes a summary of the α_reg,d produced by simple regression when applying the different sampling strategies in Table 2. While the 95% CI for the RMM is based on the likelihood-weighted MCMC results, the 95% CI for the regression slope is taken from the standard textbook equation, as in [31].
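For reference, the textbook slope estimate and its confidence interval can be computed as below. This sketch uses the large-sample normal critical value 1.96; the exact formula in [31] may instead use Student-t quantiles, which matter for the small sample sizes discussed here, and the data are synthetic stand-ins:

```python
import numpy as np

def slope_with_ci(x, y):
    """OLS slope of y on x with an approximate 95% confidence interval.

    Uses the standard error of the regression slope,
    se = sqrt( SSE/(n-2) / sum((x - mean(x))^2) ),
    and the large-sample critical value 1.96 (replace with a t-quantile
    for small n).
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    se = np.sqrt(np.sum(resid**2) / (n - 2) / np.sum((x - x.mean()) ** 2))
    return slope, slope - 1.96 * se, slope + 1.96 * se

# Synthetic log10 Q / log10 C pair with a weak dilution-type trend.
rng = np.random.default_rng(0)
logQ = rng.normal(0, 0.5, 200)
logC = -0.18 * logQ + rng.normal(0, 0.3, 200)
slope, lo, hi = slope_with_ci(logQ, logC)
print(f"slope = {slope:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

Note that this interval is only valid under the independence and homoscedasticity assumptions that, as shown in Appendix A, the real C-Q residuals violate.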
Our method shows robustness: α_post,d approaches the full-data value as the number of samples increases, and the width of the confidence range decreases at the same time. On the other hand, α_reg,d can take negative values for some sampling strategies, or even values greater than one. These negative α_reg,d do not make any sense, since α should always lie within the interval [0, 1] for the Ammer catchment; thus, α_reg,d shows a much larger impact of randomness. For example, when we increase the number of samples from d = 1 to d = 2 and d = 3, α_reg,d increases and decreases inconsistently.
Interestingly, designs d = 4 and d = 16 produce α_reg,d close to α_full, which is probably a consequence of the high number of samples (d = 4, time frequency with 40 samples) and of choosing a suitable sampling strategy based on high discharge rates (d = 16, High Q events with 40 samples).
To summarize, we have shown that the simple regression approach can be misleading for small, economically efficient sampling campaigns, and that our model is considerably more robust than conventional regression. Therefore, the results of this study can help characterize the export regime of a catchment and manage water pollution in rivers at lower monitoring costs.

Figure 5 shows plots of −log10 C versus log10 Q for the event-based designs with 40 samples (d = 12, 16, and 20). For each design, we show the slope produced by linear regression, α_reg = 0.180 (black line), and the observed values (red circles).

Performance of Sampling Strategies
Strategies based on Low Q events show a low performance (Section 4.3) because they cover only small ranges of C and Q (Figure 5a). As already seen in Section 4.3, the best strategies involve sampling during one or more peak events or during several events of low discharge. However, the designs based on Low and High Q events present a larger uncertainty than designs based only on High Q events. Measuring during large peak events (High Q events strategy) ensures that the data include the changes of C as Q increases and decreases (loops seen on the right side of Figure 5b,c). Measuring at both low and high events also covers a wide range of Q values (Figure 5c).
On the other hand, the worst strategies, with a higher bias of α_post, are based on river-discharge frequency and on measuring events at low Q. Combinations of samples belonging to different flood events (e.g., d = 5, 6, and 7, i.e., when the number of samples is between 5 and 20 for river-discharge frequency) do not produce a good α_post,d, since the valuable information on the memory-type relationship between Q and C is missing. Figure 6a displays d = 5 (river-discharge frequency with five samples) and Figure 6b shows d = 9 (Low Q events with five samples). Figure 6 clearly shows that five samples are not enough to capture the C-Q loop changes or to cover a wide range of Q.
When only Low Q events are measured (d = 9 to 12), the strategy fails to span a range of different Q values even when the number of samples is increased to 40; therefore, these designs do not produce good posteriors. Thus, some sampling strategies improve when the number of observations increases, but others do not benefit.
In summary, we have shown that sampling strategies based on extreme events (including High Q or both High and Low Q) are key to inferring the catchment behavior with low bias and low uncertainty. This is important, regardless of whether using our RMM or linear regression.

Comparison to Other Studies
The idea that the perceived C-Q relationship changes when samples are taken by time frequency or by events is not new. Recent studies have distinguished between low and high discharge rates to determine C-Q relationships [5,6,22,36,40], and others have looked at the frequency of sampling [14,22] or sampling strategy [34].
Minaudo et al. [40] created a model in which C is subject to slow (seasonal) and fast (storm-event) variations of Q. Knapp et al. [5] showed that the C-Q relationship differed significantly for some solutes (NO3, K, Cl) depending on when samples were taken; the C-Q relationship of these solutes was impacted differently when discharge changes were caused by individual peak events or by longer time scales. These studies observed opposite export regimes depending on when measurements were taken. In contrast, the RMM shows robustness (Table 5), since it does not yield "opposite" export regimes depending on when samples are taken.
Floury et al. [14] evaluated the effects of lowering the sampling frequency on the hydrochemical signal by artificially subsampling high-frequency data. They found that the concentration signals were degraded by smoothing. Bieroza et al. [11] showed that a lower number of samples produced more uncertainty in water quality parameters as a function of the C-Q slope. In our study, we observed the same trend. However, our RMM can predict a good export regime with only five samples if the sampling strategy is based on High Q events (d = 13); nevertheless, the prediction improves notably when the number of samples increases. Zhang and Ball [34] tested different sampling approaches and used various models (including the WRTDS model). Their best strategy included a mixture of 12 regular samples and 8 stormflow samples per year. However, they did not consider a sampling strategy that only accounted for High Q events. The finding that sampling at High Q events better informs the evolution of concentration patterns with discharge was expected: at baseflow, nitrate concentrations remain quite constant, so the information added by sampling again at low discharge is low; in contrast, at high discharges, where nitrate concentrations evolve the most, an additional sample is valuable for characterizing the concentration variation with discharge.

Summary and Conclusions
In this work, we proposed a more robust alternative for modeling the export behavior of a catchment. Generally, the export regime of a catchment (chemostatic, dilution, or mobilization type) is characterized by the slope obtained by plotting log10 C versus log10 Q measured in the river at the catchment outlet. However, in this representation, the measurements usually show hysteresis effects of the system, violating the assumptions of linear regression. With these assumptions violated, linear regression results can be suboptimal, biased, and strongly misleading. In contrast, our regime-and-memory model (RMM) includes an explicit parameter for the type of export regime and a parameter for the characteristic time scale of release and chemical turnover in the catchment. Thus, our model can mimic these memory effects of the catchment.
For demonstration, we used a nine-month series of nitrate concentrations collected with a high-frequency in situ analyzer in a catchment in Germany. We showed that our simple model is able to mimic the nitrate dynamics of the catchment, including the hysteresis effects. Our model also showed robustness and consistency in inferring the export regime compared to linear regression, even for much smaller data sets. Thus, our RMM has the potential to improve export regime predictions.
Regarding the sampling strategies, we found that the strongest strategies for the Ammer catchment are based on High Q events. These strategies cover a large range of discharge rates and the accompanying C values. Representativeness of the collected data (in the face of short-term fluctuations in compound release and catchment memory effects) is ensured by stretching the sampling days across events. From our analyses, we observed that five samples/observations are sufficient to produce a significant reduction of the α bias (i.e., better accuracy) if the right sampling strategy is implemented (the High Q events strategy). However, to decrease the uncertainty (i.e., increase the precision), one must increase the number of samples.
Thus, for catchments with characteristics similar to those of the Ammer catchment, we recommend collecting samples when peak events occur, since this provides a wide range of Q in a short time interval. However, this bears the risk that large events depend on the season, so their observation may be seasonally restricted. To avoid this seasonal restriction, we alternatively recommend measuring samples at uncorrelated events at low Q, although the uncertainty of the export regime will then be larger than when C is measured only at High Q events. Hence, sampling strategies based on extreme events, at both low and high Q, are key to reducing the prediction uncertainty of the catchment behavior, and this event-based thinking can reasonably be generalized to catchments whose C-Q behavior can be represented by our RMM.
Future work with the RMM includes validating these statements with longer time series, more solutes, and data from different catchments. Further research will also be needed to extend the presented statistical approach of the RMM to cover α for the mobilization-type transport regime (i.e., positive correlations of log10 C with log10 Q), in which wet periods do not result in dilution but in additional mobilization of substances.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Residuals in the C-Q Regression of the Ammer Catchment
The residuals are computed as

Residual(t) = log10 C(t) − f(log10 Q(t)),

where f(log10 Q(t)) is the linear-regression prediction of log10 C(t). Figure A1 shows that the residuals have neither a constant mean of zero nor a constant variance.