Improving Parameter Transferability of GR4J Model under Changing Environments Considering Nonstationarity

: Hydrological nonstationarity has brought great challenges to the reliable application of conceptual hydrological models with time-invariant parameters. To cope with this, approaches have been proposed to consider time-varying model parameters, which can evolve in accordance with climate and watershed conditions. However, the temporal transferability of the time-varying parameter was rarely investigated. This paper aims to investigate the predictive ability and robustness of a hydrological model with time-varying parameter under changing environments. The conceptual hydrological model GR4J (G é nie Rural à 4 param è tres Journalier) with only four parameters was chosen and the sensitive parameters were treated as functions of several external covariates that represent the variation of climate and watershed conditions. The investigation was carried out in Weihe Basin and Tuojiang Basin of Western China in the period from 1981 to 2010. Several sub-periods with di ﬀ erent climate and watershed conditions were set up to test the temporal parameter transferability of the original GR4J model and the GR4J model with time-varying parameters. The results showed that the performance of streamﬂow simulation was improved when applying the time-varying parameters. Furthermore, in a series of split-sample tests, the GR4J model with time-varying parameters outperformed the original GR4J model by improving the model robustness. Further studies focus on more diversiﬁed model structures and watersheds conditions are necessary to verify the superiority of applying time-varying parameters. L.X. and J.C.; L.Z., and J.C.; supervision, L.X.; L.Z. and J.-S.K.; visualization, L.Z.; writing – original draft, L.Z.; writing – review and editing, L.X., D.L., J.C., and J.-S.K.


Introduction
The conceptual hydrological models are simplified representations of the complex rainfall-runoff processes in the real world. Hence, great emphases are placed on the calibration of parameters to improve the performance of hydrological models. The optimal constant parameter set for a catchment can be inferred using various approaches [1,2] under the assumption that both the climate and the catchment characteristics are stationary. However, the ability to produce reliable predictions with the constant parameter set has been challenged under changing climate and watershed conditions [3][4][5][6][7][8].
The transferability of the model parameters between periods with different climate and watershed conditions is essential for reliable application of the hydrological models. Plenty of previous studies have revealed the fact that model parameters are strongly dependent on the data period with which the hydrological models are calibrated [9][10][11]. In general, loss of robustness and variation in model performance can be witnessed when the model parameters are inappropriately transferred [12]. For example, a model calibrated under wet conditions is usually unlikely to perform well under dry conditions and vice versa. In addition, the optimal parameter set may become no longer suitable once the watershed conditions change significantly as a result of human activities (e.g., the construction of dams, afforestation and deforestation). The impact of such changes on hydrologic processes such as streamflow has been well documented [13,14]. In such cases, hydrological models with time-invariant parameters may show a poor predictive performance due to their inability to explicitly represent the changes within the catchment.
One approach to make the model perform robustly under changing environments is to identify the optimal parameter set with time consistent performance [15][16][17][18][19][20][21][22]. Such approach is based on a series of calibration procedures on different sub-periods, and careful identification of the parameter sets with consistent performance for all these sub-periods. All parameter sets are independently evaluated in every sub-period, and only the parameter set that performs well in all cases are finally selected. Nevertheless, this is achieved at the cost of a possible decline in model performance for some individual sub-periods.
Another approach to improve the predictive performance of hydrological models under changing environments is to allow model parameters to evolve over time [23][24][25][26][27][28][29][30][31][32][33]. For example, Wallner and Haberlandt [25] linked the time-varying model parameters of a modified version of the HBV (Hydrologiska Byråns Vattenbalansavdelning) model with the climate indices. The results showed a significant improvement in model performance for streamflow simulation when using the time-varying parameters, especially for low flows. Pathiraja et al. [26] demonstrated the potential for the data assimilation method to detect the temporal variation in all parameters of the Probability Distributed Model and to improve model performance for streamflow simulation. Westra et al. [27] reported a significantly improved streamflow simulation performance of the GR4J model when the parameter represents the production storage capacity was made dependent on covariates describing seasonality, annual variability, and longer-term trends. Deng et al. [28] assumed the parameters of the two-parameter monthly water balance model to be time-varying as functions of catchment properties. Their case study in southern China suggests that the incorporation of time-varying parameters can enhance the capability of hydrological models to produce reliable predictions under changing environments. These studies suggest that the time-varying parameter may serve as an effective compensation for the possible deficiency of model structure, which failed to represent changes in hydrological dynamics within the watershed.
However, few recent studies focus on the temporal transferability of the time-varying parameters (i.e., robustness under extrapolation). Some previous studies [25,27,28] have tested the temporal transferability of the time-varying parameter from only one calibration period to only one validation period. They reported that the streamflow simulation using a model with the time-varying parameter is better than the original model during the validation procedure. However, these studies failed to fully account for the cases where time-varying parameter is transferred between periods with contrasting climate and watershed conditions (e.g., from driest period to wettest period), which remains an interesting topic that deserves further researches.
This study aims to investigate whether a hydrological model with time-varying parameter would achieve a better streamflow simulation performance and parameter transferability under changing environments than its original form. The GR4J model was chosen for its simplicity in model structure and parsimony in model parameters [34]. The relatively sensitive parameters of the GR4J model were treated as time-varying and assumed to be a function of several external covariates. Several sub-periods with different climate and watershed conditions were set up for a series of split-sample test for the GR4J model with constant parameter and time-varying parameter. The investigation was carried out for Weihe Basin and Tuojiang Basin in western China.

The Original GR4J Model and GR4J Model with Time-Varying Parameter
The original GR4J model has four parameters, namely the production storage capacity x 1 , the groundwater exchange coefficient x 2 , the one day ahead maximum capacity of the routing storage x 3 and the time base of unit hydrograph x 4 , as presented in Table 1. More details about the GR4J model can be found in the work of Perrin et al. [34]. A comprehensive sensitivity analysis of all four parameters is needed to determinate which parameter is relatively more sensitive and should be treated as time-varying. To this end, the Sobol' sensitivity analysis method [35] was applied in this study. Considering that this study focuses on changing environments where stationary rainfall-runoff relationship is usually absent, the Sobol' sensitivity indices were computed at the monthly timescale. This temporally discretized sensitivity analysis was expected to provide significant information about model behavior and capture the diversity of parameter sensitivity during different periods [36].
The GR4J model with time-varying parameters is referred to as GR4J-T model hereafter. The selected time-varying parameters are treated as functions of several external covariates related to the alternation of climate or watershed conditions. The potential covariates for time-varying parameters are denoted as V j ( j = 1, 2, . . . , M) and their long-term mean values as V j ( j = 1, 2, . . . , M), where M is the total number of the external covariates. When the parameter x i (i = 1, 2, 3, 4) of the GR4J model is considered to be time-varying, it is denoted as x i,t , where t indicates the time. The general relationship of x i,t with the corresponding external covariates is assumed to be, which can be rewritten as, where x i,c stands for the constant value of parameter x i over the calibration period, while β i,j and λ i,j (λ i,j = β i,j · x i,c ), j = 1, . . . , M, are the corresponding regression parameters. The f j (·) is the link function between the time-varying parameter x i,t and the external covariate V j , which can take multiple forms such as f (z) = z (linear), f (z) = e z (exponential), or f (z) = ln(z + 1) (logarithmic), so as to account for possible linear or non-linear relationship of the time-varying parameter with the external covariates.
Choosing suitable external covariates is a critical issue for the utilization of Equation (1) and (2), but it is usually limited by the availability of dataset. Considering that in practice the daily changes in any external covariates could only cause negligible changes in the time-varying parameters, so only impacts of the monthly changes in any physical covariates on the parameter were investigated in this study, i.e., the time-varying parameters were updated monthly in every simulation run. Inspired by recent work [27,28], 8 external covariates related to variation of climate or watershed underlying conditions, including 1-month, 2-month and 3-month antecedent monthly precipitation and potential evapotranspiration (P 1 , P 2 , P 3, PET 1 , PET 2 and PET 3 ) and NDVI (Normalized Difference Vegetation Index) of current month (NDVI 0 ), one-month shifted NDVI (NDVI 1 ) (as shown in Table 2), were considered as candidate covariates for the time-varying parameter. Table 2. Description of external covariates for time-varying parameters.

Covariates
Unit Description P 1 mm One-month antecedent monthly total precipitation P 2 mm Two-month antecedent monthly total precipitation P 3 mm Three-month antecedent monthly total precipitation PET 1 mm One-month antecedent monthly total potential evapotranspiration PET 2 mm Two-month antecedent monthly total potential evapotranspiration PET 3 mm Three-month antecedent monthly total potential evapotranspiration NDVI 0 -NDVI value for the current month NDVI 1 -NDVI value for the previous month

Model Calibration and Evaluation
The differential split-sample test (DSST) is a common way to evaluate hydrological models between sub-periods with contrasting climate and watershed underlying conditions [37,38]. In practice, usually only two or three contrasting sub-periods can be identified due to the availability of the hydro-meteorological data. This may limit the potential of DSST for accessing the transferability of model parameters and drawing general conclusions. To overcome this limitation, the generalized split-sample test (GSST) proposed by Coron et al. [39,40] was chosen to test the temporal transferability of parameter for the GR4J and GR4J-T model in this study. The main objective of GSST procedure is to evaluate the model performance under as many and as varied climatic and watershed conditions as possible. To this end, a sliding window with a specific length (e.g., five years) is applied to define the sub-periods (as illustrated in Figure 1). In Figure 1, the blue bars indicate the sub-periods while the grey bars represent the remaining part of the time series. Between two adjacent sub-periods, the sliding window is moved by one year. When the sub-periods creation completed, both the GR4J and GR4J-T model were calibrated in each sub-period. Then the parameter sets obtained was used to perform validation test on independent sub-periods, i.e., the sub-periods that do not overlap with the calibration sub-period. For example, the parameter calibrated during SP1 won't be validated in SP2, SP3, SP4 and SP5, as illustrated in Figure 1. For more detail about GSST, the readers can refer to the work of Coron et al. [39,40]. The original GR4J model and the GR4J-T model have different numbers of calibrated parameters. For the original GR4J model, the number of calibrated parameters is 4, i.e., where act Q and sim Q are the observed and model-simulated streamflow series, respectively,  When the sub-periods creation completed, both the GR4J and GR4J-T model were calibrated in each sub-period. Then the parameter sets obtained was used to perform validation test on independent sub-periods, i.e., the sub-periods that do not overlap with the calibration sub-period. For example, the parameter calibrated during SP1 won't be validated in SP2, SP3, SP4 and SP5, as illustrated in Figure 1. For more detail about GSST, the readers can refer to the work of Coron et al. [39,40]. The original GR4J model and the GR4J-T model have different numbers of calibrated parameters. For the original GR4J model, the number of calibrated parameters is 4, i.e., θ = {x 1 , x 2 , x 3 , x 4 }. In the case of the GR4J-T model, when the parameter x i (i = 1, 2, 3, 4) is treated to be time-varying, its constant value x i,c and the corresponding regression parameters λ i,j ( j = 1, . . . , M) are calibrated together with other time-invariant parameters. For example, when only the parameter x 1 is regarded as the time-varying parameter, parameters that require calibration are θ = x 1,c , λ 1,1 , . . . , λ 1,M , x 2 , x 3 , x 4 . Similarly, when multiple parameters are assumed to be time-varying, e.g., x 1 and x 2 , the calibrated parameters become θ = x 1,c , λ 1,1 , . . . , λ 1,M , x 2,c , λ 2,1 , . . . , λ 2,M , x 3 , x 4 .
The widely used Kling-Gupta Efficiency (KGE) [41] was applied to assess the overall performance of the GR4J and GR4J-T model. The KGE is calculated as follows, where Q act and Q sim are the observed and model-simulated streamflow series, respectively, r(Q act , Q sim ) is the Pearson's correlation coefficient between the observed and simulated streamflow series, µ(Q act ) and µ(Q sim ) are the mean values of the observed and simulated streamflow series, respectively, σ(Q act ) and σ(Q sim ) are the standard deviation of the observed and simulated streamflow series, respectively. KGE value ranges from −∞ to 1, with a value closer to 1 indicating a better simulation performance. The bias between the observed and simulated streamflow (BIAS) was also applied, which is calculated as follows, where Q act (t) and Q sim (t) are the observed and model-simulated streamflow at time t, respectively. BIAS with a value of 0 indicates no bias, and a value above 0 means an overestimation of the total streamflow volume. Both the GR4J and GR4J-T models were calibrated using the SCEM (Shuffled Complex Evolution Metropolis) optimization algorithm [2,42], which has been widely used for practical assessment of parameter uncertainty in hydrological modeling. The SCEM algorithm, partially inspired by the SCE-UA (Shuffled Complex Evolution -University of Arizona) algorithm [1], merges the strengths of Metropolis-Hastings sampling [43,44], controlled random search, competitive evolution and complex shuffling to evolve a population of sampled points to an approximation of the posterior distribution of the parameters. Besides, the SCEM method can identify the most likely parameter set and meanwhile its underlying posterior probability distribution in every single run [45]. The likelihood of a parameter set was calculated as the corresponding KGE value. Considering that the likelihood value must be nonnegative and monotonically increasing with improved performance, the parameter set that leads to a KGE value below 0 was rejected during the evolution procedure within SCEM. The convergence of the algorithm was determined using the Gelman-Rubin statistic [46], which is calculated on the posterior densities to check whether convergence to a stationary target distribution has been reached. For more detail about the SCEM optimization algorithm, the reader can refer to the work of Vrugt et al. [2].
After the convergence of the SCEM optimization algorithm, the final posterior distribution of parameter can be obtained. In total 5000 parameter sets were sampled from the posterior distribution to account for parameter uncertainty in this study. It should be noted that the 5000 parameter sets can lead to very similar streamflow simulation performance in terms of the KGE criteria, and the one corresponding to the highest KGE value (i.e., the KGE-best) was chosen to represent the estimate of the global optimum and used for the model evaluation and selection.

Temporal Transferability Test of Model Parameters
When a parameter set θ that calibrated from the sub-period D (denoted as "donor period" hereafter) is transferred to the sub-period R (i.e., validation, denoted as "receiver period"), the Kling-Gupta Efficiency (KGE) can be rewritten as, where Q act,R is the simulated streamflow series of sub-period R, Q sim,R [θ D ] is the simulated streamflow series of sub-period R using parameter set obtained from sub-period D. In addition, KGE R→R is defined to access the streamflow simulation performance during sub-period R using parameter sets obtained from sub-period R.
Note that the temporal transferability test on one single parameter set could be less persuasive. To fully evaluate the temporal parameter transferability of both the GR4J and GR4J-T model, the 5000 parameter sets from each sub-period were then validated at other independent sub-periods, and the average performance of both models were compared. To this end, a parameter transferability criteria (PTC) is defined as: where KGE c D→R (n) and KGE t D→R (n) represent the KGE D→R value in the case of the GR4J and the GR4J-T model using the n th parameter set, respectively. Here, N takes the value of 5000. Note that PTC with a value above 0 indicates a better parameter transferability for the GR4J-T model over the original GR4J model when transferred from sub-period D to sub-period R.

Study Data and Area
Weihe Basin and Tuojiang Basin in Western China were chosen as the study areas. Weihe Basin is located between the coordinates 33 • 40 -37 • 26 N and 103 • 57 -110 • 27 E, with a drainage area of 148,000 km 2 above Huaxian hydrological station, as shown in Figure 2. The elevation within Weihe Basin ranges from 3671 m in the western upstream region to 318 m in the eastern downstream region. Dominated by the semi-arid continental monsoon climate, most precipitation and flood events in Weihe Basin occur in late summer and early autumn. For Weihe Basin, climate variability is significant and human activities have been proved to be the main cause of the alternation of flow regimes. Previous studies have shown that the annual streamflow of the gauge Huaxian has been declining over the past decades [47][48][49], mainly due to the increasing human activities including the agricultural irrigation, the construction of large water control projects and the implementation of the water-soil conservation projects [50,51]. In addition, the variations of the annual precipitation have also contributed to the reduced annual streamflow in Weihe Basin [52].  Tuojiang Basin (28 • 88 -30 • 29 N, 105 • 44 -108 • 20 E) is an upstream tributary of the Yangtze River, with a total length of 712 km. Tuojiang Basin covers 19,613 km 2 with Fushun hydrological station as the catchment outlet. The elevation within Tuojiang Basin ranges from 264 to 4741 m and decreases from northwest to southeast. Tuojiang Basin is dominated by the subtropical monsoon climate, with most precipitation and flood events occurring in summer and early autumn. The annual streamflow of Tuojiang Basin was also in a downtrend. However, few recent studies focus on Tuojiang Basin and the reason for the declined annual streamflow remains unclear.
The input data for the GR4J model include precipitation (P) and potential evapotranspiration (PET). The precipitation and air temperature data were provided by National Meteorological Information Center of China [53] on a daily basis for the period from 1981 to 2010 and has been widely validated for its quality in previous studies [47][48][49]. Limited by availability of the meteorological dataset, the potential evapotranspiration (PET) was calculated using the Blaney-Criddle method [54] from the daily average air temperature and the daily sunshine duration. To calibrate and validate the GR4J model, the observed streamflow (Q) series from the gauge Huaxian and Fushun were obtained for the period 1981-2010. The NDVI dataset for the period from 1981-2010 was obtained from the GIMMS (Global Inventory Modelling and Mapping Studies) NDVI-3g product [55], which has been widely validated [55,56]. Since its temporal resolution is 15-day, the monthly NDVI is represented by the mean value of the two NDVI values in each month.

Diagnostics of Hydrological Nonstationarity
Considering the available hydro-meteorological dataset for Weihe Basin and Tuojiang Basin are for the period 1981-2010, in total 26 sub-periods of 5-year in length are set up following the method described in Figure 1. The reasons for the selection of 5-year as the sub-period length are: (1) 5-year is sufficient for continuous hydrological simulation, (2) a large number of sub-periods (26 in this study) can be set up given that only 30-year records were available. To identify the long-term variation of the hydro-meteorological data series of both basins, the non-parametric Mann-Kendall method [57,58] was applied to examine the trends in the annual P, PET, Q and annual runoff ratios (RR) series of the period 1981-2010, and the result is shown in Table 3. The results show a significant downtrend for the annual Q and RR series, a slight downtrend for annual P series and uptrend for the PET series in Weihe Basin. In the case of Tuojiang Basin, a slight downtrend for annual P series and significant downtrend for annual PET, Q, and RR series are presented. It can be found that both basins witnessed a clear nonstationary in rainfall-runoff relationship. In terms of the annual NDVI series, Weihe Basin shows a clear uptrend while Tuojiang Basin shows no clear changes. This may be attributed to the fact that Weihe Basin has experienced an incessant construction of water-soil conservation projects (e.g., afforestation) since the early 1980s [49].
Variability is also apparent between longer periods, as shown in Figure 3, where the relative mean P, PET, Q, and RR values for all sub-periods (5-year sliding window) are plotted for both basins.  Table 3. Mann-Kendal (MK) test results of the annual hydro-meteorological data series, the annual runoff ratios and the annual NDVI data series for Weihe Basin and Tuojiang Basin. The critical value of the MK statistic is Z 1−α/2 = 1.960 with α = 0.05.

Basin Data Z statistic Trend
Weihe Basin

4.2.Parameter Estimation for the GR4J and GR4J-T Model
To incorporate time-varying sensitivity analysis, the Sobol' sensitivity analysis was repeated at a monthly temporal resolution rather than over the whole period (1981-2010). The monthly sensitivity indices (including the first-order and total-order) of each parameter were therefore obtained with KGE as the metric. More detail about the time-varying sensitivity analysis can be found in the work of Herman et al. [36]. Figure 4 provides a simple and direct interpretation of the relationships between the sensitivity of the GR4J's parameters and changing precipitation and streamflow conditions, which was achieved by sorting the monthly sensitivity indices (only the totalorder were presented for brevity) for both basins along ascending gradients of monthly precipitation and streamflow.
For Weihe Basin, the parameter 1 x , which controls the production storage capacity, shows a strong sensitivity during both high-flow and low-flow months. In contrast, the parameter 2 x , 3 x

Parameter Estimation for the GR4J and GR4J-T Model
To incorporate time-varying sensitivity analysis, the Sobol' sensitivity analysis was repeated at a monthly temporal resolution rather than over the whole period (1981-2010). The monthly sensitivity indices (including the first-order and total-order) of each parameter were therefore obtained with KGE as the metric. More detail about the time-varying sensitivity analysis can be found in the work of Herman et al. [36]. Figure 4 provides a simple and direct interpretation of the relationships between the sensitivity of the GR4J's parameters and changing precipitation and streamflow conditions, which was achieved by sorting the monthly sensitivity indices (only the total-order were presented for brevity) for both basins along ascending gradients of monthly precipitation and streamflow. The correlation between monthly runoff-ratios and monthly external covariate is summarized in Figure 5 in term of Spearman's rank correlation coefficient. Note that the monthly runoff-ratios were calculated as the ratios of monthly runoff depth (mm) and the monthly total precipitation (mm). It can be found that the monthly runoff-ratios has a strong correlation with PET1 and NDVI0 for Weihe Basin. In the case of Tuojiang Basin, the covariates with high correlation coefficient are P2 and NDVI0. This correlation analysis was done to justify the reasonability of incorporating the abovementioned external covariates into the time-varying parameters. It may be arbitrary to assert that an external covariate with a weak correlation coefficient is inappropriate and useless. In this study, all 8 external covariates were applied to construct the time-varying parameters for both Weihe Basin and Tuojiang Basin, which is detailed in the following section. For Weihe Basin, the parameter x 1 , which controls the production storage capacity, shows a strong sensitivity during both high-flow and low-flow months. In contrast, the parameter x 2 , x 3 and x 4 are less sensitive. In the case of Tuojiang Basin, the parameter x 1 and x 3 dominate the months with low-flow and less-precipitation, while the parameter x 2 and x 4 show relatively weaker sensitivity. Thus, the parameter x 1 is assumed to be time-varying for Weihe Basin, and the parameter x 1 and x 3 for Tuojiang Basin.
The correlation between monthly runoff-ratios and monthly external covariate is summarized in Figure 5 in term of Spearman's rank correlation coefficient. Note that the monthly runoff-ratios were calculated as the ratios of monthly runoff depth (mm) and the monthly total precipitation (mm). It can be found that the monthly runoff-ratios has a strong correlation with PET 1 and NDVI 0 for Weihe Basin. In the case of Tuojiang Basin, the covariates with high correlation coefficient are P 2 and NDVI 0 . This correlation analysis was done to justify the reasonability of incorporating the abovementioned external covariates into the time-varying parameters. It may be arbitrary to assert that an external covariate with a weak correlation coefficient is inappropriate and useless. In this study, all 8 external covariates were applied to construct the time-varying parameters for both Weihe Basin and Tuojiang Basin, which is detailed in the following section. To reduce the influence of the initial state, a spin-up period of 1-year was used at the parameter calibration procedure for each sub-period, and the simulated streamflow within the spin-up period was excluded in the model assessment procedure. To fully explore the potential for the GR4J-T model in streamflow simulation, all combination of the eight external covariates were investigated and their corresponding performances were compared in terms of the KGE and BIAS criteria. Considering a large number of model assessment procedure (e.g., 255 covariate combinations for each sub-period in Weihe Basin , 1   2  3  4  5  6  7  8  8  8  8  8  8  8  8  8 255 C C C C C C C C + + + + + + + = ), only a small part of the results is presented hereafter for the sake of brevity. For each covariate combination, totally 5000 parameter sets were obtained from the calibration procedure and only the KGE-best for each covariate combination is shown here. For example, The corresponding equations for time-varying parameter 1,t x are also presented in Table 4. It should be noted that three forms of link functions including 'linear', 'exponential' and 'logarithmic' were used to link the external covariates with the time-varying parameter. Three link functions lead to similar model performance, while the 'linear' was always the best and simplest one. So only the 'linear' is shown here. It can be found that the streamflow simulation performance significantly improves when parameter 1 x was allowed to change over time. For example, the KGE value is 0.752 when only the covariate 1 P was considered when compared to 0.727 for the original GR4J model. The difference in BIAS criteria (from −16.1% to −6.3%) also indicates a better water balance simulation of the GR4J model with time-varying parameter. Furthermore, the highest KGE value (0.761) was achieved when  To reduce the influence of the initial state, a spin-up period of 1-year was used at the parameter calibration procedure for each sub-period, and the simulated streamflow within the spin-up period was excluded in the model assessment procedure. To fully explore the potential for the GR4J-T model in streamflow simulation, all combination of the eight external covariates were investigated and their corresponding performances were compared in terms of the KGE and BIAS criteria. Considering a large number of model assessment procedure (e.g., 255 covariate combinations for each sub-period in Weihe Basin,C 1 8 , only a small part of the results is presented hereafter for the sake of brevity. For each covariate combination, totally 5000 parameter sets were obtained from the calibration procedure and only the KGE-best for each covariate combination is shown here. For example, Table 4 presented the streamflow simulation performance for 8 different cases (C0-C7) during the sub-period 1 (SP1) in Weihe Basin. The case C0 was designed to represent the original GR4J model, where the value of parameter x 1 is invariant over time. Under cases C1-C7, the external covariate P 1 , PET 1 and NDVI 0 were incorporated into the time-varying parameter x 1,t . The corresponding equations for time-varying parameter x 1,t are also presented in Table 4. It should be noted that three forms of link functions including 'linear', 'exponential' and 'logarithmic' were used to link the external covariates with the time-varying parameter. Three link functions lead to similar model performance, while the 'linear' was always the best and simplest one. So only the 'linear' is shown here.
It can be found that the streamflow simulation performance significantly improves when parameter x 1 was allowed to change over time. For example, the KGE value is 0.752 when only the covariate P 1 was considered when compared to 0.727 for the original GR4J model. The difference in BIAS criteria (from −16.1% to −6.3%) also indicates a better water balance simulation of the GR4J model with time-varying parameter. Furthermore, the highest KGE value (0.761) was achieved when P 1 , PET 1 and NDVI 0 were incorporated into the time-varying parameter.

Streamflow Simulation Performance of the GR4J and GR4J-T Model
Similarly, the model assessment procedure was repeated for the rest sub-periods (SP2-SP26). Table 5 presents the streamflow simulation performance of the GR4J model and the GR4J-T model for all 26 sub-periods in Weihe Basin. Note that the equations for time-varying parameter x 1,t corresponds to the covariate combination case with the best streamflow simulation performance in terms of the KGE criteria during the corresponding sub-period, i.e., the KGE-best one. The GR4J-T model significantly outperforms the original GR4J model in terms of the KGE criteria and achieve a better water balance simulation in most cases. This is not unexcepted as previous studies have proven the advantage of applying time-varying parameter in streamflow simulation [27,28].  The case of Tuojiang is more complicated since two parameters (x 1 and x 3 ) are assumed to be time-varying. Therefore, four scenarios were considered for each sub-period, including: (1) both parameters are constant, (2) x 1 is constant and x 3 is time-varying, (3) x 1 is time-varying and x 3 is constant, (4) both parameters are time-varying. However, only the results of the first and fourth scenario are presented here in Table 6, because: (1) the calibration procedure is not the main purpose of this study, (2) the second and third scenario show less significant improvement in term of KGE when compared to the fourth scenario. For Tuojiang Basin, it is clear that the GR4J-T outperforms the original GR4J model, though the improvement in terms of KGE is not as significant as the case of Weihe Basin. This may partly be attributed to the fact that the original GR4J can achieve a satisfying performance in most sub-periods for Tuojiang Basin. Table 6. Comparison of streamflow simulation performance of the GR4J model and the GR4J-Model in Tuojiang Basin for all sub-periods (SP1-SP26).

GR4J Model GR4J-T Model
Values of x 1 and x 3

BIAS (%)
Equations for x 1,t and x 3,t KGE (-) The observed streamflow and the uncertainty bounds associated with model parameter for both models during part of SP1 (1982)(1983) are presented in Figure 6 (for Weihe Basin) and Figure 7 (for Tuojiang Basin) for an illustration purpose. The uncertainty bounds associated with model parameter were calculated as follows [59]. For both the GR4J and GR4J-T model, the streamflow simulation for a certain sub-period was repeated using the 5000 parameter sets, and 5000 simulated hydrographs were therefore obtained. Then the likelihood value of each parameter set was assigned to the respective simulated hydrograph. At each time step of the simulation, the posterior distributions of simulated streamflow can be calculated based on the 5000 simulated streamflow and the corresponding likelihood values. The upper and lower uncertainty bounds are here defined as the 2.5% and 97.5% quantiles of the posterior distribution, respectively. Then the 95% predictive uncertainty bounds associated with model parameter for both models were obtained. More detail of the calculation of the predictive uncertainty bounds can be found in the work of Blasone et al. [59]. The red and grey shaded areas indicate the prediction uncertainty associated with the GR4J model and GR4J-T model, respectively. It can be found that most parts of the observed streamflow series lie within the uncertainty bounds for both models, indicating a good simulation performance. In addition, the original GR4J model tends to underestimate the low flows in Weihe Basin, as presented in Figure 6b, where the simulated streamflow series is constantly lower than the observed in November of 1983. However, the average relative bandwidth of the uncertainty bounds associated with the GR4J-T model is wider, which indicates a higher parameter uncertainty. This agrees with the finding of Wallner and Haberlandt [25], who showed that time-varying model parameters can improve the model performance at the cost of possible growth in parameter uncertainty. The growth in parameter uncertainty may also relate to the choice of the external time-varying covariates when applying time-varying parameter, which may be an interesting topic that deserves further exploration. Figure 6a presents the constant value of the parameter x 1 and its temporal dynamic when treated as time-varying for Weihe Basin during part of SP1 (1982)(1983). The constant value was obtained from the KGE-best parameter set of the GR4J model, while the temporal dynamic of x 1,t is presented in the form of boxplot using the 5000 resultant parameter sets of the GR4J-T model. It can be seen that the time-varying parameter x 1,t increases in the growing season and decreases as the dormant season begins, with its values around the constant value of the parameter x 1 . The case is the same for the parameter x 1 and x 3 in Tuojiang Basin, as shown in Figure 7a     The increased value of the parameter 1 x during the growing season indicates larger catchment storage (e.g., interception) and, therefore, a weaker response of the watershed to precipitation inputs. As a result, the same hydrological inputs of precipitation may lead to less streamflow, therefore lower runoff ratio when compared to the dormant season. Likewise, the relatively low value of the parameter 1 x during the dormant season may lead to a stronger response to the precipitation inputs and may help to reduce the possibility to underestimate the low flow, as mentioned above. The The increased value of the parameter x 1 during the growing season indicates larger catchment storage (e.g., interception) and, therefore, a weaker response of the watershed to precipitation inputs. As a result, the same hydrological inputs of precipitation may lead to less streamflow, therefore lower runoff ratio when compared to the dormant season. Likewise, the relatively low value of the parameter x 1 during the dormant season may lead to a stronger response to the precipitation inputs and may help to reduce the possibility to underestimate the low flow, as mentioned above. The parameter x 3 , which controls the routing storage of GR4J model, shows a similar trend with x 1 in the case of Tuojiang. Similarly, the relatively low values of the parameter x 3 during dormant season, could contribute to a more rapid hydrological response and therefore a better representation of the low flow.
The improved streamflow simulation performance can be attributed to: (1) the dynamics in model parameter, which may help to compensate the possible deficiency of model structure, (2) the increased degree of freedom of model parameters. For the GR4J-T model applied in Weihe Basin, the number of parameters that required calibration is 7. The trade-off between a simple model structure and good model performance could be an interesting topic that deserves further researches, especially under changing environments.

Parameter Transferability of the GR4J and GR4J-T Model
To test the parameter transferability of the GR4J and GR4J-T Model, the abovementioned 5000 parameter sets obtained from each sub-period (donor period) were then validated at its independent sub-periods (receiver periods). For example, Figure 8 compares the streamflow simulation performance of the GR4J model and GR4J-T model under 10 different sub-periods using parameter sets obtained from SP1 in Weihe Basin. The corresponding PTC criteria are also presented. As a reference, the red solid line represents the highest value of KGE (KGE c R→R ) achieved by the original GR4J model using parameter from each receiver period, while the blue solid line for the GR4J-T model (KGE t R→R ). The blue boxes represent the KGE values (KGE c D→R ) achieved by the original GR4J model using the 5000 parameter sets obtained from SP1, and the green boxes represent those of the GR4J-T model (KGE t D→R ). It is clear that when parameter sets were transferred from the donor period to the receiver periods, the model performance significantly decreased for both the GR4J model and GR4J model. However, the maximum value of the green boxes (KGE t D→R ) is constantly higher than those of the blue boxes (KGE c D→R ), as shown in Figure 8. Besides, the PTC criteria, which is the difference between the average value of KGE achieved using 5000 parameter sets by both models over each receiver period (see Equation (6)), is above 0 in most cases, which indicates a better temporal transferability of the GR4J-T model. The case is the same for Tuojiang Basin (see Figure 9), though the corresponding PTC values are quite close to 0.   To comprehensively compare the temporal transferability of the constant parameter and the time-varying parameter, the PTC criteria for every possible case is also presented in Figure 10. The blank area indicates the cases where the transferability test is inapplicable (e.g., between adjacent sub-periods or among sub-periods that overlap each other). The labels on the y-axis indicate the 26 donor periods while the labels on the x-axis indicate the receiver periods. Each cell in the figure stands for a transferability test and its color represents the corresponding PTC value. The red cells, indicating positive PTC values, stands for the cases where the GR4J-T model shows a better temporal transferability over the GR4J model, while the blue cells are in the opposite cases. It is clear that the PTC criteria have a positive value in most cases. This further demonstrates that the GR4J-T model has advantage in term of temporal transferability over the original GR4J model. To comprehensively compare the temporal transferability of the constant parameter and the time-varying parameter, the PTC criteria for every possible case is also presented in Figure 10. The blank area indicates the cases where the transferability test is inapplicable (e.g., between adjacent sub-periods or among sub-periods that overlap each other). The labels on the y-axis indicate the 26 donor periods while the labels on the x-axis indicate the receiver periods. Each cell in the figure stands for a transferability test and its color represents the corresponding PTC value. The red cells, indicating positive PTC values, stands for the cases where the GR4J-T model shows a better temporal transferability over the GR4J model, while the blue cells are in the opposite cases. It is clear that the PTC criteria have a positive value in most cases. This further demonstrates that the GR4J-T model has advantage in term of temporal transferability over the original GR4J model.
Note that the loss in model performance when the model parameter was transferred among sub-period with different climate and watershed conditions is inevitable even for the hydrological model with time-varying parameter. Here, it was proven that a better transferability can be achieved under changing environments when applying the time-varying parameter. Though in practice, applying the time-varying parameter is not an easy job, as a more complicated calibration procedure is necessary. However, it is still worth identifying the sensitive parameter and find out how it may change over time, which may help to improve the model structure and to provide new sights for better understanding the hydrological processes. Note that the loss in model performance when the model parameter was transferred among subperiod with different climate and watershed conditions is inevitable even for the hydrological model with time-varying parameter. Here, it was proven that a better transferability can be achieved under changing environments when applying the time-varying parameter. Though in practice, applying the time-varying parameter is not an easy job, as a more complicated calibration procedure is necessary. However, it is still worth identifying the sensitive parameter and find out how it may change over time, which may help to improve the model structure and to provide new sights for better understanding the hydrological processes.

Conclusions
This study investigated the parameter transferability under changing environments of the original GR4J model and its modified version, in which the parameter was allowed to vary over time. To set up the GR4J model with the time-varying parameter, a sensitivity analysis was applied to identify the most sensitive parameter, which was then treated as time-varying and assumed to be a function of several external covariates. Both models were calibrated and validated in a series of sub-periods with different climatic and watershed conditions. The investigation was carried out for Weihe Basin and Tuojiang Basin in western China. The main findings of this study are as follows: (1) A better streamflow simulation performance in terms of the KGE criteria was achieved by the GR4J model with time-varying parameters (x 1 for Weihe Basin, x 1 and x 3 for Tuojiang Basin) during most calibration sub-periods for both basins, though at the cost of slight growth in parameter uncertainty, (2) The GR4J model with time-varying parameter shows a better transferability among sub-periods with different climate and watershed conditions when compared to the original GR4J model for both basins.
Although the time-invariance of model parameters is one of the basic criteria of a high-quality hydrologic model, very few (if any) models can achieve this due to their inherent limitations. Numerous previous studies have proved that hydrological models with time-varying parameters can achieve a better streamflow simulation performance during the calibration period. This study further demonstrates the advantage of the GR4J model with time-varying parameter in terms of temporal transferability among different sub-periods. This may provide new insights for improving the structure of existing hydrological models and for building new models.
Considering the availability of the dataset for both basins, only six climate-related covariates (P 1 , P 2 , P 3, PET 1 , PET 2, and PET 3 ) and 2 watershed-related covariates (NDVI 0 and NDVI 1 ) were accounted for to describe the dynamic variation of the selected parameter. More covariates should be investigated in further researches. Besides, the key factor that leads to the alternation in flow regimes also differs for different catchments. Thus, more emphases should be placed on the careful identification of proper external covariates when applying time-varying parameter.