An Evaluation of Catchment Transit Time Model Parameters: A Comparative Study between Two Stable Isotopes of Water

Using δ18O and δ2H in mean transit time (MTT) modeling can ensure the verifiability of results across catchments. The main objectives of this study were to (i) evaluate the δ18O- and δ2H-based behavioral transit time distributions and (ii) assess if δ18O and δ2H-based MTTs can lead to similar conclusions about catchment hydrologic functioning. A volume weighted δ18O (or δ2H) time series of sampled precipitation was used as an input variable in a 50,000 Monte Carlo (MC) time-based convolution modeling process. An observed streamflow δ18O (or δ2H) time series was used to calibrate the model to obtain the simulated time series of δ18O (or δ2H) of the streamflow within a nested system of eight Prairie catchments in Canada. The model efficiency was assessed via a generalized likelihood uncertainty estimation by setting a minimum Nash–Sutcliffe Efficiency threshold of 0.3 for behavioral parameter sets. Results show that the percentage of behavioral parameter sets across both tracers were lower than 50 at the majority of the studied outlets; a phenomenon hypothesized to have resulted from the number of MC runs. Tracer-based verifiability of results could be achieved within five of the eight studied outlets during the model process. The flow process in those five outlets were mainly of a shallow subsurface flow as opposed to the other three outlets, which experienced other additional flow dynamics. The potential impacts of this study on the integrated use of δ18O and δ2H in catchment water storage and release dynamics must be further investigated in multiple catchments within various hydro-physiographic settings across the world.


Introduction
Mean transit time (MTT) has been globally used as an effective metric for describing the water storage and release mechanisms of catchments. The MTT is defined as the average time water spends travelling from an input point (such as the catchment surface) to an output point (which could be a stream outlet or a well) [1]. MTT has important applicability in evaluating the streamflow 1.
To evaluate the distribution of δ 18 O-and δ 2 H-based behavioral parameters from which the MTT was extracted.

2.
To determine how early in the model time-step δ 18 O-and δ 2 H-based behavioral solutions are achieved. 3.
To assess the agreement (or lack thereof) of the absolute MTT values as retrieved from the δ 18 O and δ 2 H model kernels and to infer if they lead to similar conclusions about the nested catchment system.
Achieving the objectives will potentially help inform watershed scientists on the implications of applying the dual water isotopes (δ 18 O and δ 2 H) in catchment water storage and release dynamics studies in light of the influence of evaporative processes on the study results. This will help in advancing studies on ways to improve result acceptability in TBC MTT modeling.

Materials and Methods
The focus of this study was on a nested system of a 74.4 km 2 catchment comprising of eight outlets. The outlets are named in order of flow regime from upstream to downstream as MS1, MS2, MS3, MS4, MS7, MS9, HWY240, and Miami (with 0.28, 0.19, 0.48, 0.31, 1.27, 1.88, 34.6, and 74.4 km 2 as corresponding areas, respectively); MS1 and Miami forming the headwater and final exit, respectively, of the whole catchment system. This nested system of catchments, called the South Tobacco Creek Watershed (STCW), was located in south central Manitoba, Canada ( Figure 1). The STCW was generally composed of a shale bedrock overlain by moderate to strong calcareous glacial till and clay-loam soils [27]. The HWY240 outlet sat on the edge of a NW-SE trending escarpment, also called the Pembina Escarpment. The MS4 and MS7 outlets had small retention ponds located immediately upstream for stormwater control purposes; the pond behind MS7 had the tendency to overflow during snowmelt and extreme storm events. Total annual precipitation was 529.5 mm (including 30% of snowfall), while the mean daily air temperature was 2.9 • C [28]. Long term  values of mean total annual precipitation was 512.3 mm, while that of mean daily air temperature was 2.86 • C [28]. Precipitation (including snowmelt captured by snow lysimeters) and stream water samples were collected from the eight outlets across multiple seasons (spring, summer, and fall) of 2014, at a frequency of at least once every week. The sampling period began during the freshet (in March) and ended just at the start of the snowfall in October. Concurrently, precipitation amounts were obtained from the nearby weather survey of the Canada station. All the samples were stored in 10 mL glass vials, sealed with parafilm, and kept at 4 • C until analysis. They were then tested for δ 18 O and δ 2 H using a Picarro TM Liquid Water Isotope Analyzer (LWIA, model L2130-i) based on cavity ring-down spectroscopy (CRDS) technology. Delta (δ) values were recorded in permil (% ) deviations from the Vienna standard mean ocean water (VSMOW) [29], with a precision of 0.025 % and 0.1 % for δ 18 O and δ 2 H, respectively. Time series of δ 18 O (or δ 2 H) of precipitation and streamflow were obtained. We then assumed the transit time distribution ((TTD) g(τ))-the kernel from which the mean transit time (MTT) is extracted from-of the nested catchment system to belong to a family of gamma distributions (Equation (1)) for which the ((MTT (τ)) is equivalent to αβ (Equation (2)): The assumption of a choice of a gamma transfer function was appropriate based on the general geology of the catchment and was confirmed by comparisons of goodness-of-fit values during the initial model optimization process of "small model runs", where other transfer functions were also tested. where α and β are the shape and scale factors of the gamma distributions, respectively. Convolution integral modeling was performed in the time domain using the δ 18 O time series data of precipitation and streamflow, which was then repeated for the δ 2 H data. The two-parameter gamma distribution model, chosen for the catchment system, was selected as a stationary TTD (noted as g(tτ)) and convolved with the amount-weighted δ 18 O (or δ 2 H) isotopic timeseries of precipitation (δ in (τ)) to predict the δ 18 O (or δ 2 H) isotopic timeseries in the streamflow (δ out (t)) [4,30] via Equation (3): For each of the eight sub-catchments, the input data record was artificially filled or extended by a sine-wave approximation technique [31,32]. A 30-year warm-up period was used to actuate the model; the isotopic time series data was looped 30 times backward-a process referred to as a "loop scenario" (e.g., [33])-to achieve the warm-up condition, at the end of which the calibration process began. The search for behavioral model parameters was conducted by means of 50,000 Monte Carlo (MC) simulations in MATLABR2017b. A generalized likelihood uncertainty error (GLUE) [34] analysis was performed by setting a minimum Nash-Sutcliffe ffficiency (NSE) threshold of 0.3 for behavioral parameter sets. The MTT was estimated by multiplying the α and β parameters associated with the MC simulation that yielded the highest NSE.

Results
The maxima of the transit time distribution (TTD) at the outlets appear to be higher for the δ 18 O category than the δ 2 H, the only exception being at the Miami outlet. In contrast, the minima TTDs appear to be generally lower for the δ 2 H class than for δ 18 O (Figure 2). The maximum TT was about 10,000 days for the δ 18 O class and about 9900 days for the δ 2 H class. A spearman's rank correlation analysis of the maximum TT across the eight sites showed no significance at the 95% level between  To help evaluate the first research objective, all behavioral parameter sets of α and β (i.e., the α and β values of all the MC simulations (out of the 50,000) that resulted in NSE scores of 0.3 and above) were retrieved from the δ 18 O-and δ 2 H-based models. Boxplots were then used to assess the distribution of transit times (TTs) within each tracer category. The percentage of simulations (out of the 50,000) that resulted in behavioral parameter sets within the δ 18 O and δ 2 H tracer categories was evaluated. The time-step of the MC simulation, at which the behavioral parameter sets corresponding to the highest NSE occurred, was noted for both δ 18 O and δ 2 H tracers and used to evaluate the second research objective. The third and final research objective was evaluated by multiplying the α and β parameter sets that corresponded with the MC simulation associated with the highest NSE. This was performed for both tracers and for all eight sub-catchments. Standard errors (±1σ) around the MTT were calculated from the behavioral TTs that emerged from the 95th percentile of NSEs. To ascertain if there were any significant differences (or lack thereof) between δ 18 O-and δ 2 H-based MTTs, a Spearman's rank correlation analysis was performed at the 95% significance level. The catchments were then assessed in the context of increasing order of MTTs according to both tracers.

Results
The maxima of the transit time distribution (TTD) at the outlets appear to be higher for the δ 18 O category than the δ 2 H, the only exception being at the Miami outlet. In contrast, the minima TTDs appear to be generally lower for the δ 2 H class than for δ 18 O ( Figure 2). The maximum TT was about 10,000 days for the δ 18 O class and about 9900 days for the δ 2 H class. A spearman's rank correlation analysis of the maximum TT across the eight sites showed no significance at the 95% level between the δ 18 O-and δ 2 H-based TTs (Rho = 0.61; p-value = 0.38). The maximum TT values across the studied outlets appear to occur at outlets MS1 and MS9 for each tracer class ( Figure 2). The minimum TT was 80 days and 70 days, respectively, for the δ 18 O and δ 2 H class, both occurring at site MS3. The median TTs were higher within the δ 18 O class at all the outlets except Miami ( Figure 2). Within the δ 18 O class, the highest median TT was about 6000 days and occurred at outlet MS1. A total of 4000 days was the highest median TT within the δ 2 H category, and it also occurred at outlet MS1 ( Figure 2). There were no marked differences in the ranges of TT values between the δ 18 O and δ 2 H classes across the sites, since the tracer specific boxes appear similar in size. All TT outliers appear to be on the upper end of the distribution in each tracer category, with values above 11,000 days ( Figure 2). The only outlets without TT outlier values were MS3, MS4, and the δ 2 H-based tracer class for Miami ( Figure 2). There was generally a lower percentage of behavioral TT (out of the 50,000 MC runs) within each δ 18 O and δ 2 H class across all the eight outlets; with values at MS3 and MS4, for instance, being less than 2% (Table 1). There was no behavioral TT parameter at outlet HWY240 within the δ 2 H class (Figures 2 and 3; Table 1). In general, across all the outlets, there was a higher number of behavioral parameters within the TTD of the δ 18 O tracer than there was for δ 2 H ( Table 1). The GLUE bound appears to show some seasonality in the prediction of the δ output variable; the uncertainty band is bigger during the freshet and smaller as the season progresses into summer and fall ( Figure 3).
The α (shape factor) values corresponding to the best NSE score at the outlets ranged from 0.04 to 0.93 for the δ 18 O-based tracer and 0.14 to 0.93 for the δ 2 H-based tracer ( Figure 4; Table 1). The β (scale factor) counterparts were 2 to 4747 for the δ 18 O-based tracer and 50 to 3701 for the δ 2 H-based counterpart ( Figure 5; Table 1). The highest α, corresponding to the highest NSE, occurred at outlet MS3 in both tracer classes whiles the lowest occurred at outlet HWY240 within the δ 2 H class. Within the δ 18 O class, the highest β, corresponding to the highest NSE, occurred at outlet MS2 while the lowest occurred at HWY240. The highest δ 2 H-based β also occurred at outlet MS2 while the lowest occurred at outlet MS3 ( Table 1). The time-step for the convergence on the best α at all the eight outlets matches the time-step for converging on β (Table 1). In general, and within each tracer class, the simulation appears to be converging on the best behavioral α and β at later time-steps (beyond time-step 25,000) ( Table 1). The absolute MTTs from the δ 18 O-and δ 2 H-based tracers were different except for the value at the Miami outlet (95 days for both tracer categories) (Figure 6h). In general, the MTT of the catchments in this study were less than 9 months (Table 1). However, tracer specific MTTs at the outlets were, different. A significance test at the 95% level did not show any association between δ 18 O-and δ 2 H-based MTT results (Rho = 0.56; p-value = 0.21). However, when the tracer specific MTTs were ranked in increasing order, five out of the eight outlets matched (Figure 7). The α (shape factor) values corresponding to the best NSE score at the outlets ranged from 0.04 to 0.93 for the δ 18 O-based tracer and 0.14 to 0.93 for the δ 2 H-based tracer ( Figure 4; Table 1). The β (scale factor) counterparts were 2 to 4747 for the δ 18 O-based tracer and 50 to 3701 for the δ 2 H-based counterpart ( Figure 5; Table 1). The highest α, corresponding to the highest NSE, occurred at outlet MS3 in both tracer classes whiles the lowest occurred at outlet HWY240 within the δ 2 H class. Within the δ 18 O class, the highest β, corresponding to the highest NSE, occurred at outlet MS2 while the lowest occurred at HWY240. The highest δ 2 H-based β also occurred at outlet MS2 while the lowest occurred at outlet MS3 ( Table 1). The time-step for the convergence on the best α at all the eight outlets matches the time-step for converging on β (Table 1). In general, and within each tracer class, the simulation appears to be converging on the best behavioral α and β at later time-steps (beyond timestep 25,000) ( Table 1). The absolute MTTs from the δ 18 O-and δ 2 H-based tracers were different except for the value at the Miami outlet (95 days for both tracer categories) (Figure 6h). In general, the MTT of the catchments in this study were less than 9 months (Table 1). However, tracer specific MTTs at the outlets were, different. A significance test at the 95% level did not show any association between δ 18 O-and δ 2 H-based MTT results (Rho = 0.56; p-value = 0.21). However, when the tracer specific MTTs were ranked in increasing order, five out of the eight outlets matched (Figure 7).

Discussion and Implications
A total of three major outcomes emerged from this study: (i) There were lower percentages (out of the 50,000 MC simulations) of behavioral parameters (parameters that passed the NSE > 0.3 threshold) at all the eight outlets for both the δ 18 O-and δ 2 H-based data; (ii) time-steps for the retrieval of the best behavioral shape and scale parameters associated with both tracers occurred at latter times of the simulation at the majority of outlets; and (iii) mean transit times associated with each tracer across the majority of outlets were less than nine months. Though δ 18 O-based MTTs did not match that of the δ 2 H category at the majority of outlets, there were general similarities in the outlet positions on the δ 18 O-and δ 2 H-based MTT ranks (Figure 7). Regarding outcome (i), the lower percentage of behavioral parameters makes it potentially easier for optimum behavioral parameter identifiability, but that also implies a lessening of the efficiency of model performance. Figures 4  and 5 suggest model insensitivity in the majority of the parameter space for both tracers. This is especially evident for the scale parameter search (Figure 5), which assumes model sensitivity within few areas of the parameter space. This model behavior potentially reduces the efficiency of the search for behavioral solutions. For instance, some of the worse NSEs were observed at outlets MS3 and MS4 (0.51 and 0.61 for δ 18 O-and δ 2 H-based tracer at outlet MS3; and 0.5 and 0.7 for counterparts at outlet MS4) and these outlets correspond to the lowest percentage of behavioral parameters (less than 2% at each of those two outlets) ( Table 1). Contrary to observations at outlets MS3 and MS4, results from outlets MS2, MS7, and MS9 show some of the best NSEs (Table 1). The outlets recording the best NSEs correspond to some of the highest percentages of behavioral parameters (Table 1) within both tracer categories across all eight outlets. Regarding Outcome (ii), behavioral solutions occurred at latter time-steps during the model run. This was observed within both tracer categories and across the majority of the studied outlets. This raises one key question: Does the model appear to be performing better as the simulation advances? If that is the case, then this has direct implications for model optimization in the form of the length of the warm-up period (which was 30 years in the current study) and the number of Monte Carlo simulations (which was 50,000 in this study). Given the average time it takes to complete these kinds of MATLAB-based MC MTT simulations (about two months for each tracer in the current study), it becomes imperative to consider other methods of model optimization which would constrain the parameter search space in order to make the model more efficient in the context of run times. Outcome (iii) reinforces the understanding that a lot of the water within the Canadian Prairie catchments is very young, i.e., less than three months old [11,35]. There is also the reaffirmation that absolute MTTs should not be interpreted as standalone values but must be confirmed with the convergence of results from other methods. This is particularly useful for the comparison of inter catchment storage functioning [25]. The lack of significant association between the two tracer-based MTTs highlights the potential effects of evaporative losses on, especially, the δ 2 H-based MTTs. Because the vapor pressure between 1 H/ 2 H is higher than 16 O/ 18 O, residual evaporated water appears to be more enriched in δ 2 H compared to δ 18 O, since these two isotopes are unequally affected by kinetic fractionation [3,36]. This phenomenon may explain why no behavioral model parameters were recorded for the δ 2 H-based MTT at HWY240 (Figures 2-6; Table 1). The fractionation dynamic is consistent with the prairies (a semi-arid terrain), where summer stream samples may be retrieved from stagnant (albeit slowly flowing) streams. Outcome (iii) further affirms the notion that the use of both δ 18 O and δ 2 H in modeling could help build confidence in result acceptability [4]: The ranking of the outlets using tracer specific MTTs leads to process implications as to why all eight outlets were not similarly ranked across the two tracers. A process interpretation, using isotope-based hydrograph separation techniques within the STCW [37], touts the five outlets that agreed in the context of δ 18 O and δ 2 H MTT ranks, as experiencing shallow subsurface flow (SSF) processes. The remaining three that failed to agree have other processes in play in addition to the SSF; MS4 experiences detention losses under the dam in between storm events; HWY240 receives water via deep percolation of passive stores of old water; and Miami receives Hortonian overland flow as well as groundwater upwelling during heavy storm events [35,37]. While a dossier of dominant flow processes across the Canadian prairies has been well documented in the study by [38], what is evident from this work, from the isotopic standpoint, is that during low flow conditions, sources of water in the stream may not necessarily be from deep groundwater, in view of the fact that the ages of the water are generally young (less than nine months). What this study rather appears to show is that SSF sources (made up of relatively young water) contribute greatly to streamflow at the majority of the studied outlets. One key exception was at the HWY240 outlet; the short MTT was at odds with the high old water fractions [37] and the relatively small fraction of young water (i.e., the fraction of stream water that is of age three months or less) [37]. However, further studies at HWY240, using hydrometric data [39], showed that runoff ratios were generally greater than 1 at HWY240, pointing to a potential piston flow mechanism akin to fast percolation of precipitation or meltwater through the fractured bedrock and the pushing out of old groundwater into the stream outlet in a manner consistent with the [40] version of celerity. The isotopic and hydrometric results validate the hypothesis that, at HWY240, the stream is fed by passive stores of groundwater during low flow conditions. The distribution of the α and β parameters in the NSE space across the sites (Figures 4 and 5) appear consistent with relatively shorter arrival times for the δ 2 H-related TTs (Figures 2 and 4); for example, the NSE versus α and β space plots (Figures 4 and 5) generally show relatively shorter α and β values around the behavioral NSEs-the parameter values at the maxima of the plots-for δ 2 H-related data across the majority of sites (except HWY240 and Miami). Except for site MS9, out of the remaining six, those outlets exhibited SSF processes reflected in the MTT ranks observed in Figure 7. The distribution of transit times in the NSE space show tracer similarity for outlets MS1 and Miami (Figure 6a,h). The Miami outlet had the same MTT results (95 days) when both δ 18 O and δ 2 H tracers were independently applied, while δ 18 O-and δ 2 H-based MTTs at outlet MS1 were almost identical. Assessment of the GLUE of the output variable revealed strong seasonality in the catchment system: The early part of the season appeared to be dominated by snowmelt recharge mechanisms, evidenced by the depleted δ compositions of the stream water and a relatively bigger uncertainty band (Figure 3). Later in the summer and fall, the uncertainty bands become narrower as rain events dominated (Figure 3). There is a strong suggestion of residual snowmelt signals within the catchment, thus affecting the efficiency of the predictions as the season progressed into rain-dominated conditions in June and July. This is reflected in the poor fit of the observed and predicted output variable during June and July the majority of sites (Figure 3).
While our results add to the existing knowledge of the joint use of δ 18 O and δ 2 H tracers in the studies of catchment water storage and release dynamics, it still has potential limitations in the form of parameter identifiability in the Monte Carlo simulation space (α and β were set at [0, 1] and [0, 50,000], respectively, after initial model optimization in the current study). If this parameter identifiability space could be narrowed, it would hold promise at improving model efficiency. Secondly, disagreement in the ranking of some of the outlets, according to the δ 18 O-and δ 2 H-based MTT results, still raises the question of influence of evaporative losses in catchment MTT studies. In recent years, the influence of evaporative losses in many isotope-based hydrologic studies has been minimized by the introduction of a normalizing parameter, such as d-excess [41]. Integrating the δ 18 O and δ 2 H data to obtain a time series of d-excess of precipitation and streamflow as input and output variables in MTT modeling may hold potential for future process hydrology studies. Gaining grounds on both issues of parameter identifiability and the minimization of evaporative influences will be worthwhile in our interpretation and application of catchment MTT results in making decisions regarding catchment hydrologic functioning.
Author Contributions: S.B. was involved in the research conceptualization, methodology, software acquisition, code troubleshooting and validation, data analysis, and original draft preparation. S.A.A.-A. was involved in data analysis, writing-review and editing. J.Q.-B. was involved in data analysis, writing-review and editing. M.C.W. was involved in image visualization, writing-review and editing. S.S.G. was involved in manuscript organization, writing-review and editing. G.K.A. was involved in writing-review and editing as well as contribution to responses to reviewer and editor comments.
Funding: This work was made possible, in part, by the award of the University of Manitoba Graduate Fellowship to the first author. Support was also made available by the Natural Sciences and Engineering Research Council of Canada (NSERC) through a Discovery Grant awarded to Genevieve Ali, the first author's PhD advisor.