An Ensemble Climate-Hydrology Modeling System for Long-Term Streamflow Assessment in a Cold-Arid Watershed

Abstract: Climate change can bring about substantial alterations of temperature and precipitation in their spatial and temporal patterns. These alterations would impact the hydrological cycle and cause flood or drought events. This study has developed an ensemble climate-hydrology modeling system (ECHMS) for long-term streamflow assessment under a changing climate. ECHMS consists of multiple climate scenarios (two global climate models (GCMs) and four representative concentration pathway (RCP) emission scenarios), a stepwise-cluster downscaling method, and the semi-distributed land use-based runoff process (SLURP) model. ECHMS is able to reflect the uncertainties in climate scenarios, tackle the complex relationships (e.g., nonlinear/linear, discrete/continuous) between climate predictors and predictands without functional assumptions, and capture the combination of snowmelt- and rainfall-runoff processes with simplicity of operation. The developed ECHMS is then applied to the Kaidu watershed to analyze the changes of streamflow during the 21st century. Results show that by 2099, the temperature increment in the Kaidu watershed is mainly contributed by warming in winter and spring. Precipitation will increase markedly in spring and autumn and decrease in winter. Multi-year average streamflow would range from 105.6 to 113.8 m³/s across all scenarios during the 21st century, with an overall increasing trend. The maximum average increasing rate is 2.43 m³/s per decade in October and the minimum is 0.26 m³/s per decade in January. Streamflow change in spring is more sensitive to climate change due to its complex runoff generation process. The obtained results can effectively identify future streamflow trends and help decision makers manage water resources.


Significance and Motivation
Climate change is one of the most profound global changes in the 21st century and has brought about a series of impacts on the stability of the Earth system and on human beings [1,2]. According to the Intergovernmental Panel on Climate Change Fifth Assessment Report (IPCC AR5), the global mean surface temperature is likely to increase by 0.3-4.8 °C relative to the base period of 1986-2005 [3,4].
At the same time, climate change also brings about substantial alterations of precipitation in its spatial and temporal patterns. These changes would have implications for evaporation, snowmelt, infiltration and runoff, altering the hydrological cycle and causing flood or drought events [5]. Together with the increasing demands for water resources in recent years, climate change challenges the security of water resources [6]. To pursue sustainable water resources management in the context of climate change, it is urgent to clearly understand how climate change influences streamflow generation from the perspective of water supply.

Literature Review
Numerous studies have been conducted in recent years to quantitatively investigate climate change impacts on streamflow [7]. For example, Qin and Lu [8] explored the flood frequency of the Heshui watershed during the 21st century, where the Long Ashton Research Station Weather Generator was used for processing the outputs of global climate models (GCMs) and the semi-distributed land use-based runoff processes (SLURP) model was then used for predicting future streamflow. Umut and Okan [9] evaluated the streamflow of the Izmir-Tahtali freshwater basin, where an artificial neural network-based downscaling model was employed to produce fine-scale precipitation and temperature from GCMs, and the downscaled outputs were transformed to runoff by means of a monthly parametric hydrological model. Zhou et al. [10] investigated the streamflow response to climate change in the Lake Dianchi watershed; the statistical downscaling model (SDSM) and the soil and water assessment tool (SWAT) model were used to connect the outputs of GCMs and streamflow. Eldho and Ghosh [11] incorporated two GCMs, a nonparametric kernel-regression-based statistical downscaling model, and the variable infiltration capacity (VIC) model into an ensemble for streamflow prediction in the Godavari River basin. Gorguner et al. [12] developed a hydro-climate model by coupling the weather research and forecasting (WRF) model to the physically based watershed environmental hydrology (WEHY) model to obtain the projected future inflows to Demirkopru Reservoir.
Generally, the common approach to analyzing climate change impacts on hydrology is to pair climate variables from GCMs with hydrologic models. Data-driven and physically process-based models are frequently used in climate downscaling and hydrological simulation. Statistical or dynamic downscaling methods are employed for refining the coarse-resolution outputs of GCMs and reducing the scale gap between GCMs and hydrological models. In practical applications, the statistical downscaling approach is widely used because of its lower computational cost [12]. Most statistical downscaling approaches are based on empirical mathematical functions relating GCM-resolution climate variables to local observation data, and they also require long, standardized observational time series for fitting and validating the statistical relationship [13,14]. As for hydrological modeling, distributed physically based models (e.g., SWAT and VIC) are more popular because they can provide detailed information about the flow characteristics within the watershed [15,16]. However, these distributed physically based models often require large numbers of parameters that are difficult to obtain due to the nonlinearities of processes and spatial heterogeneity [17].

Research Gap
The Kaidu watershed is a sub-basin of the Tarim basin (the largest inland basin in China) with cold-arid climate characteristics. This watershed is less disturbed by human activities and is mainly covered by grassland. The river originates from the Southern Tienshan mountain and the streamflow is contributed by both snowmelt and precipitation. Though several research works related to climate-change impact analysis have been conducted in the Kaidu watershed, there are still some challenges [18-20]. First, this is a macroscale watershed (about 12 × 10³ km²) with complicated topography and varied elevation, which leads to complex climate and hydrological systems with tremendous variability. As a result, the relationships between large-scale atmospheric variables and the corresponding watershed-scale climate factors may vary and cannot be reasonably expressed through either linear expressions or continuous mathematical functions [14,21]. Second, this is a data-scarce region in which the national meteorological and hydrological stations include only the Bayanbulak and Dashankou stations. Long and continuous records of high-quality observed local hydro-meteorological variables (e.g., temperature and precipitation) may be lacking in the cold-arid mountainous region. The scarce data bring great challenges to the use of downscaling and hydrological models that largely depend on long and continuous data. These challenges call for a climate-hydrology model that can not only run without complicated data sources, but also simulate the snowmelt process.

Novelty and Objective
Stepwise cluster analysis (SCA) is a nonparametric statistical tool for tackling discrete and nonlinear systems based on a multivariate analysis of variance [22]. It can describe the complex relationship between the predictors and predictands as a cluster tree, without assumptions of functional relationships [23]. Therefore, the SCA-based statistical method can be used for climate downscaling in the Kaidu watershed when the relationship between the large-scale atmospheric variables and watershed-scale climate factors is nonfunctional and discrete, and when the data are not temporally continuous. From the perspective of the hydrological model, SLURP is a semi-distributed model that is able to simulate the physical process of runoff generation with simplicity of operation. The SLURP model can not only simulate the process of snowmelt with a degree-day method in meso- and macro-scale watersheds, but also avoid the data and computational demands of fully distributed models [8]. Consequently, the combination of SCA and the SLURP model is promising for exploring the hydrological responses to climate change in the Kaidu watershed. To the best of our knowledge, few studies have combined the two models for this watershed.
Therefore, the objective of this study is to propose an ensemble climate-hydrology modeling system (ECHMS) for the assessment of streamflow response to climate change. ECHMS integrates GCMs, the SCA downscaling method and the SLURP model into one framework; each model makes a unique contribution to the climate-change impact analysis. In detail, ECHMS is able to (1) reflect the uncertainties associated with climate models and emission scenarios due to the differences in model physical mechanisms and human activities; (2) tackle the complex relationships between predictors and predictands without functional assumptions; (3) capture the snowmelt process and simulate the hydrological process with simplicity of operation; and (4) generate long-term streamflow under a changing climate for sustainable water resources management. The developed ECHMS is then applied to the Kaidu watershed, a cold-arid region in northwest China, for predicting streamflow in the 21st century. The obtained results are expected to help managers gain reliable information on water resources and make adaptive decisions.

Material and Methods
In this study, an ECHMS was developed and applied to the Kaidu watershed. The detailed framework of ECHMS is displayed in Figure 1. In ECHMS, eight climate scenarios (two GCMs combined with four representative concentration pathways (RCPs)) are set to obtain future climate change data and to reflect the uncertainties in climate models caused by physical mechanisms and human activities. The stepwise cluster analysis technique is applied to downscale the raw GCM outputs to the site scale, so that the downscaled data (e.g., temperature and precipitation) can represent the possible climate variations of the Kaidu watershed. These climate data are then used to force the calibrated SLURP model to generate future streamflow. Finally, the future streamflow is assessed to identify the hydrological characteristics of the Kaidu watershed in the 21st century.
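The eight-scenario design described above can be enumerated with a short sketch. The function names and the label format are illustrative assumptions, not part of the original modeling system; only the GCM and RCP names come from the text.

```python
from itertools import product

# Climate models and emission pathways named in the text.
GCMS = ("HadGEM", "MIROC")
RCPS = ("RCP2.6", "RCP4.5", "RCP6.0", "RCP8.5")


def build_scenarios(gcms=GCMS, rcps=RCPS):
    """Return one label per (GCM, RCP) combination (hypothetical label format)."""
    return [f"{gcm}_{rcp}" for gcm, rcp in product(gcms, rcps)]


def ensemble_range(flow_by_scenario):
    """Summarise per-scenario mean streamflow (m3/s) as a (min, max) envelope."""
    values = list(flow_by_scenario.values())
    return min(values), max(values)


if __name__ == "__main__":
    for name in build_scenarios():
        print(name)
```

Reporting the envelope across scenarios, rather than a single run, is what allows the ensemble to express scenario uncertainty as a range.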

Problem Statement of Study Area
The Kaidu watershed is a cold-arid region, located between 42°14′ N-43°21′ N and 82°58′ E-86°05′ E in northwest China (as shown in Figure 2). The watershed covers an area of 18,827 km² with a large contrast in elevation (from 1400 to 4778 m a.s.l.). As recorded at the meteorological station of Bayanbulak (with an elevation of 2487 m a.s.l.), the annual temperature of this basin is −4.5 °C, and the minimum temperature is about −48.1 °C. It has an average annual pan evaporation of 1100 mm and precipitation of 262.6 mm [24]. Due to the variation in elevation, this basin is characterized by strong gradients in both temperature and precipitation. The annual temperature and precipitation at Dashankou Station (with an elevation of 1400 m a.s.l.) are 7.5 °C and 99.6 mm, respectively. In addition, precipitation varies with the seasons, and more than 80% of the annual precipitation occurs from May to September. The gross annual amount of surface water resources of the basin is approximately 3.3 × 10⁹ m³.
Due to climate change and human activities, the annual temperature and precipitation in the Kaidu watershed have increased at decadal rates of 0.34 °C and 6.07 mm, respectively, in the last century. If the increase in precipitation falls short of the temperature increase, evaporation may intensify and thus aggravate drought. Because climate is an important driver of the hydrological cycle, the warming climate inevitably alters the distribution of water resources. In particular, temperature can directly affect glaciers, the most important source of water resources in the Kaidu watershed, and thus can significantly affect the water resources and environmental conditions. Therefore, evaluating the future climate variation is very important for ecological environment and water resource management in the Kaidu watershed.

Data Collection and Analysis
The datasets include (1) the GCMs used for climate projection, (2) the observed meteorological data used for both climate projection and hydrological model running, (3) spatial data for watershed delineation (i.e., a digital elevation model (DEM) and land cover), and (4) hydrometric data for streamflow simulation. In detail, the GCMs (with periods of 1985-2000 and 2010-2099) used in this study were HadGEM and MIROC, under RCPs of 2.6, 4.5, 6.0 and 8.5, which were downloaded from the Coupled Model Intercomparison Project (CMIP5) archive on the Program for Climate Model Diagnosis and Intercomparison (PCMDI) website (http://www-pcmdi.llnl.gov). The predictors (including surface temperature, wind speed, surface upwelling longwave radiation, etc.) covering 1971-2000 were obtained from the National Centers for Environmental Prediction (NCEP) for SCA calibration and validation [21]. The DEM, with a resolution of 90 m, was obtained from the Geospatial Data Cloud website (http://www.gscloud.cn). Land cover types for the year 2000 were prepared by the Resource and Environmental Sciences Data Centre, Chinese Academy of Sciences (http://www.resdc.cn). Meteorological data covering 1971-2010 (including daily temperature, precipitation, wind speed, and sunshine hours) were obtained from two stations (i.e., Bayanbulak and Dashankou). To account for the spatial heterogeneity of temperature and precipitation, the temperature input is adjusted for elevation differences with a lapse rate of 0.75 °C per 100 m, and precipitation is increased by 1% per 100 m, based on the data obtained from the monitoring stations [25]. The streamflow data (from 1971 to 2010) for the watershed were collected from the Dashankou hydrometric station.
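The elevation adjustment of station data can be illustrated with a minimal sketch. It assumes that temperature decreases linearly with height at the stated lapse rate and that the 1% precipitation gain per 100 m is applied linearly rather than compounded; the function names are hypothetical, and the exact form used in the study is not given in the text.

```python
# Gradients stated in the text; the linear form of both adjustments is an assumption.
TEMP_LAPSE_C_PER_100M = 0.75      # temperature decrease per 100 m of ascent
PRECIP_GAIN_PER_100M = 0.01       # fractional precipitation increase per 100 m


def adjust_temperature(t_station_c, z_station_m, z_target_m):
    """Shift a station temperature (deg C) to a target elevation."""
    dz_100m = (z_target_m - z_station_m) / 100.0
    return t_station_c - TEMP_LAPSE_C_PER_100M * dz_100m


def adjust_precipitation(p_station_mm, z_station_m, z_target_m):
    """Scale a station precipitation depth (mm) to a target elevation."""
    dz_100m = (z_target_m - z_station_m) / 100.0
    # Clamp at zero so a large descent can never yield negative precipitation.
    return max(0.0, p_station_mm * (1.0 + PRECIP_GAIN_PER_100M * dz_100m))


if __name__ == "__main__":
    # Bayanbulak station (2487 m a.s.l.) moved 100 m uphill.
    print(adjust_temperature(-4.5, 2487.0, 2587.0))
    print(adjust_precipitation(10.0, 2487.0, 2587.0))
```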

Stepwise Cluster Analysis
SCA is an efficient statistical tool that can establish the relationship between predictors and predictands. It can deal with both continuous and discrete variables, as well as nonlinear relationships between the variables. It divides the sample sets of predictors (i.e., the data from NCEP) and dependent variables (i.e., temperature and precipitation) into different subsets (or sub-clusters) through a series of cutting and merging operations [22]. The processes of SCA for climate downscaling can be divided into the following steps: (1) Establishment of cluster principles. The criteria for the cut and merge operations are based on Wilks' statistic. According to Wilks' likelihood ratio criterion, if the cutting point is optimal, the value of Wilks' Λ (Λ = |W|/|W + H|) should be the minimum, where W and H are the within-group and between-group sums of squares and cross products matrices, respectively [26].
(2) Tests of optimal cutting points. Assuming the optimal cutting point of cluster $h$ is $k^{*}_{r^{*}}$, and the relevant value of temperature or precipitation is $x^{(h)}_{r^{*},k^{*}_{r^{*}}}$, then the F-test can be undertaken.
(4) Prediction. After all the calculations and tests have been completed, i.e., when all the hypotheses of further cutting or merging are rejected, a cluster tree can be derived for each dependent variable. Then, the predictors derived from the GCMs are fed into the cluster tree, and the predicted dependent variables (i.e., temperature and precipitation) can be obtained (i.e., y i = y
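The search for a cutting point that minimizes Wilks' Λ can be sketched as follows. This is an illustration only, assuming a single predictor, a two-group split, and a hypothetical minimum group size; it omits the F-test and the merging step, and it is not the authors' implementation.

```python
import numpy as np


def scatter(y):
    """Sum-of-squares-and-cross-products matrix of y about its column means."""
    centered = y - y.mean(axis=0)
    return centered.T @ centered


def wilks_lambda(y_left, y_right):
    """Wilks' Lambda = |W| / |W + H| for a split into two groups."""
    w = scatter(y_left) + scatter(y_right)          # within-group matrix W
    total = scatter(np.vstack([y_left, y_right]))   # total scatter, equal to W + H
    return np.linalg.det(w) / np.linalg.det(total)


def best_cut(x, y, min_size=2):
    """Scan all cuts along predictor x; return (smallest Lambda, cut threshold).

    x: shape (n,) predictor values; y: shape (n, d) predictand values.
    min_size is an illustrative safeguard against near-empty groups.
    """
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], y[order]
    best = (np.inf, None)
    for k in range(min_size, len(x) - min_size + 1):
        lam = wilks_lambda(y_sorted[:k], y_sorted[k:])
        if lam < best[0]:
            # Place the threshold midway between the two neighbouring samples.
            best = (lam, 0.5 * (x_sorted[k - 1] + x_sorted[k]))
    return best


if __name__ == "__main__":
    x = np.array([1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0, 13.0])
    y = np.array([[1.0], [1.1], [0.9], [1.0], [5.0], [5.1], [4.9], [5.0]])
    print(best_cut(x, y))
```

A small Λ means that most of the predictand variance lies between the two groups rather than within them, which is why the minimum marks the best candidate cut.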

SLURP Hydrological Model
The hydrological model used in this study was SLURP, which is a continuous, semi-distributed basin model that simulates the hydrological cycle from precipitation to runoff [27]. A watershed is divided into several aggregated simulation areas (ASAs) based on DEM maps through Topographic Parameterization (TOPAZ). SLURP conceptualizes each ASA as four storage tanks (canopy storage, snow storage, fast storage and slow storage), representing canopy interception, snowpack, aerated soil storage and groundwater, respectively [28]. SLURP conducts a vertical water balance for each of the land covers in the ASAs at a daily time step. At each time step, the model is applied sequentially to a matrix of ASAs and land covers. Each ASA must contribute runoff to an identifiable stream channel, which is connected to the watershed outlet.
In this study, the Kaidu watershed is divided into 183 ASAs. The weather data of each ASA (i.e., precipitation, temperature, dewpoint temperature, and global radiation) are calculated from the local weather stations using the Thiessen polygon method. In each ASA, precipitation is intercepted by the canopy or lost to evapotranspiration, and any excess falls to the ground or onto a snowpack depending on the air temperature. Evapotranspiration is calculated using the Penman-Monteith method of the Food and Agriculture Organization (FAO) [29]. If a snowpack exists and the temperature exceeds a critical value, the snowmelt is computed using a simple degree-day method. Rainfall and any snowmelt infiltrate through the soil surface into the fast store depending on the current infiltration rate. If the rainfall and snowmelt input exceeds the maximum possible infiltration rate, surface runoff is generated. Runoff is accumulated from each land cover within an ASA using a time/contributing area relationship for each land cover type [24,27,30]. Manning's equation is used to compute travel times for each land cover and to estimate the velocities for travel both upstream and downstream. Then, the combined runoff is routed to the next sub-basin by hydrological storage routing, Q = αR^β, where Q (m³/s) is the outflow, R is the combined runoff routed into the channel in the sub-basin, and α and β are parameters specified to give the required degrees of lag and attenuation.
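The degree-day snowmelt step and the storage routing relation described above can be sketched as follows. The standard degree-day form (melt proportional to the temperature excess over a critical value) is assumed here; the melt rate, the critical temperature, the use of the same threshold for the rain/snow partition, and all function names are illustrative assumptions rather than SLURP's actual parameterization.

```python
def degree_day_melt(temp_c, snowpack_mm, melt_rate, t_crit_c=0.0):
    """Daily snowmelt (mm): melt_rate * (T - T_crit), capped by the snowpack."""
    if snowpack_mm <= 0.0 or temp_c <= t_crit_c:
        return 0.0
    potential = melt_rate * (temp_c - t_crit_c)
    return min(potential, snowpack_mm)


def simulate_snowpack(temps_c, precip_mm, melt_rate, t_crit_c=0.0):
    """Return (daily liquid water input in mm, final snowpack in mm).

    Precipitation is stored as snow when T <= t_crit_c (assumed partition rule).
    """
    snowpack = 0.0
    water_input = []
    for temp, precip in zip(temps_c, precip_mm):
        if temp <= t_crit_c:
            snowpack += precip
            rain = 0.0
        else:
            rain = precip
        melt = degree_day_melt(temp, snowpack, melt_rate, t_crit_c)
        snowpack -= melt
        water_input.append(rain + melt)
    return water_input, snowpack


def route_outflow(combined_runoff, alpha, beta):
    """Storage routing Q = alpha * R**beta, as stated in the text."""
    return alpha * combined_runoff ** beta


if __name__ == "__main__":
    liquid, remaining = simulate_snowpack(
        [-5.0, -2.0, 3.0, 4.0], [10.0, 5.0, 0.0, 2.0], melt_rate=2.0
    )
    print(liquid, remaining)
```

Capping the melt by the available snowpack keeps the water balance closed, which matters in a watershed where both snowmelt and rainfall feed the runoff.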
Before future projection, the model is calibrated and verified using observed data. The calibration of SLURP is conducted for 1971-1995 by an automatic method using the Shuffled Complex Evolution algorithm developed at the University of Arizona [31]. Then, the model is validated for the period 1996-2010 using the calibrated parameter values. The Nash-Sutcliffe efficiency (NSE), coefficient of determination (R²), and deviation of volume (DV) are used to assess the goodness of fit of the hydrological model. The validation results can be found in Sun et al. [2], which show that the model is suitable for future projections. Then, the projected future temperature, precipitation, sunshine hours, and solar radiation are used to force the validated model for streamflow predictions.
The differences in temperature are mainly attributed to the varied physical mechanisms and conditions of climate models and RCPs. From the perspective of RCPs, the temperature increment is the largest under RCP8.5 and the smallest under RCP2.6. This is because RCP8.5 is a non-climate-policy scenario, resulting in severe changes in climate, whereas RCP2.6 represents a rigorous climate policy that limits greenhouse gas emissions and, accordingly, leads to low climate change impacts. The greenhouse gas emission scenarios under RCP4.5 and RCP6.0 are similar to the current situation, which means that if current industrial activities remain unchanged, future temperature would increase by nearly 3 °C. In terms of time scale, the temperature difference among the four RCPs is small in the early 21st century and gradually increases with time. As shown in Figure 3b, the monthly temperature changes in the 2080s reveal that the temperatures in July and August (summer) do not change much. In November, December, January, February, March and April, the temperature changes are relatively large. For example, in Dashankou, the temperature increase in December is 4.69 °C (RCP4.5 of HadGEM).
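Returning to the calibration and validation step, the three goodness-of-fit measures named for SLURP (NSE, R², DV) can be sketched as follows. The formulas are the commonly used definitions; the sign convention for DV (simulated minus observed volume, as a percentage of observed volume) is an assumption, since the text does not state it.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return np.corrcoef(obs, sim)[0, 1] ** 2


def deviation_of_volume(obs, sim):
    """Percentage volume error; positive means overestimation (assumed convention)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 100.0 * (sim.sum() - obs.sum()) / obs.sum()


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]
    simulated = [2.0, 3.0, 4.0, 5.0]   # constant bias of +1
    print(nse(observed, simulated), r_squared(observed, simulated),
          deviation_of_volume(observed, simulated))
```

The biased example shows why the three measures are reported together: R² stays at 1 for a constant offset, while NSE and DV both penalize it.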
The results suggest that the warming in the Kaidu watershed is mainly contributed by warming in winter and spring, resulting in a smaller temperature difference within the year. As a result, the duration of summer may increase, and the winter may become shorter.
The projections show that precipitation variation in Bayanbulak is greater than that in Dashankou. This is because Bayanbulak is located in a cold mountain area at high altitude, which is more sensitive to climate change. In addition, the projected precipitation under MIROC is higher than that under HadGEM. Annual precipitation in most years of the 21st century is about 150-300 mm higher than the baseline under MIROC, and 50-150 mm higher than the baseline under HadGEM. This shows that different GCMs differ substantially in their predicted precipitation due to the inconsistency of physical mechanisms and initial parameters. Moreover, the ability of GCMs to predict precipitation is still relatively poor, and large deviations remain. The seasonal changes of precipitation summarized in Figure 5 show that precipitation in spring and autumn at both stations would increase, whereas the summer and winter precipitation present decreasing trends. This means that in the Kaidu watershed, more abundant precipitation in spring and autumn would result in a more balanced distribution of rainfall within the year. The decreasing winter precipitation indicates that winter will become more arid, and the possibility of seasonal drought will increase.
In general, the streamflow under the RCP2.6 scenario is the lowest, and the streamflow under the MIROC model is higher than that under the HadGEM model, which is consistent with the changing trends of temperature and precipitation. By comparing various scenarios, the possible streamflow range of the Kaidu River in the future can be obtained, avoiding the errors caused by relying on the prediction of a single scenario. However, by the middle or end of this century, the increase becomes less obvious, and the streamflow even shows a downward trend.
For example, under the RCP2.6 scenario, the projected streamflow would first remain flat during the period 2010-2025. This trend may be an artifact of the SCA method. In the downscaling process, SCA may dilute the peak values of climate variables, so that the projected temperature and precipitation are downscaled into a cluster with similar values. Then, the streamflow would slightly increase before the 2050s; after the 2050s, a decreasing trend would be observed. Streamflow under RCP4.5 and RCP6.0 has a stable trend and then declines slightly after 2080; streamflow under RCP8.5 has an obvious increasing trend before 2090, after which a slight downward trend would be observed. The increase in streamflow is contributed by increments in precipitation and temperature. The Kaidu River is supplied by precipitation and by ice and snow melt, and the increase in temperature leads to an increase in snowmelt streamflow. The slight decrease in streamflow under RCP4.5 and RCP6.0 around 2080 is caused by the combined effect of precipitation and temperature. In these two scenarios, the precipitation increases only slightly, while the temperature reaches a steady state about 2 °C higher than the current temperature. The increase in temperature also leads to increased evaporation, which causes a slight decrease in the streamflow. The downward trend of streamflow in the RCP8.5 scenario is determined by temperature. Although the precipitation in this scenario increases, the temperature is 4 to 5 °C higher than the baseline. The resulting increase in evaporation is very large, far exceeding the increase in precipitation, which would lead to a decrease in streamflow.
Figure 7 presents the annual streamflow distribution during the 2020s, 2050s and 2080s. Results show that the difference in streamflow among the RCPs is the smallest in the 2020s, followed by the 2050s and 2080s. For example, in the 2020s, the average streamflow under HadGEM is around 103 m³/s, while in the 2080s, the average streamflow fluctuates between 106.2 and 121.9 m³/s. Such deviation is similar to those of temperature and precipitation. These results suggest that uncertainty exists among the different climate scenarios and that it would be amplified over time.
Figure 8 describes the changes in streamflow in different months. January has the lowest multi-year average monthly streamflow (42.1 m³/s) and July has the highest (176.1 m³/s). In addition, the streamflow in all months shows overall increasing trends. The largest increasing rate is 2.43 m³/s per decade in October and the smallest is 0.26 m³/s per decade in January. In general, the increasing rate is high in autumn and spring and low in winter and summer, illustrating that the annual streamflow increment is mainly attributed to autumn and spring. Results also show that the range of streamflow change in July and August is relatively small. This not only indicates that the climate-based prediction of summer streamflow is consistent across scenarios, but also shows that the response of summer streamflow to climate change is relatively stable. However, the results for spring (March, April and May) show a great difference among scenarios. For example, in April, the multi-year average of the minimum predicted streamflow across all scenarios is 58.2 m³/s, while the multi-year average of the maximum is 128.3 m³/s, a difference of up to 120%. This illustrates that the spring streamflow is more sensitive to climate change, which may be due to the complexity of streamflow generation in spring. The Kaidu River is supplied by both snowmelt and precipitation, so there are two streamflow peaks in a year, in spring and summer, respectively. Due to the combined influence of snow depth and temperature, snowmelt streamflow is more complicated than precipitation streamflow. Therefore, there is a large uncertainty in the projection of snowmelt streamflow under different climate scenarios.
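The two summary statistics used in the monthly analysis, a rate of change per decade and a relative spread across scenarios, can be reproduced with a short sketch. Estimating the rate as the slope of an ordinary least-squares line is an assumption, since the text does not state how the rates were derived; the function names and the synthetic series are illustrative.

```python
import numpy as np


def decadal_trend(years, flows):
    """Least-squares linear trend of a streamflow series, in m3/s per decade."""
    slope_per_year = np.polyfit(years, flows, 1)[0]
    return 10.0 * slope_per_year


def relative_spread_percent(low, high):
    """Spread between scenario extremes, as a percentage of the lower value."""
    return 100.0 * (high - low) / low


if __name__ == "__main__":
    # Synthetic series with a known slope, used only to check the function.
    years = np.arange(2010, 2100)
    flows = 100.0 + 0.243 * (years - 2010)
    print(decadal_trend(years, flows))

    # April extremes quoted in the text: 58.2 and 128.3 m3/s.
    print(relative_spread_percent(58.2, 128.3))
```

With the April values, (128.3 − 58.2) / 58.2 is about 1.20, which corresponds to the difference of roughly 120% reported for spring.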

Conclusions
In this study, an ECHMS has been developed for assessing the streamflow in the Kaidu watershed. The ECHMS consists of multiple climate change scenarios, the SCA method and the SLURP model. The modeling system reflects not only the uncertainty in climate models caused by heterogeneity in physical mechanisms and initial parameters, but also the uncertainty caused by greenhouse gas emissions. The SCA downscaling method can effectively handle the complex nonlinear and discrete relationships between climate elements and avoids the functional assumptions of conventional statistical methods. The SLURP model is a semi-distributed physical model that can effectively simulate streamflow generated by snowmelt and rainfall with simplicity of operation.
Results show that by 2099, the temperature in the Kaidu watershed will increase, with the RCP8.5 scenario having the largest increase, reaching more than 5 °C. Annual warming is mainly contributed by warming in winter and spring, resulting in a smaller temperature difference within the year. This may increase summer duration and shorten winter duration. By the end of this century, precipitation shows an increasing trend. Precipitation in spring and autumn will increase more markedly, narrowing the difference in precipitation among spring, summer and autumn. As a result, the distribution of rainfall within the year will become more balanced. The precipitation in winter has a decreasing trend, indicating that winter will become more arid. In the future, there is an overall increasing trend of streamflow, with a range from 105.6 to 113.8 m³/s. The increasing rate is high in autumn and spring and low in winter and summer. The maximum average increasing rate is 2.43 m³/s per decade in October and the minimum is 0.26 m³/s per decade in January. This illustrates that the annual streamflow increment is mainly attributed to autumn and spring. The summer streamflow response to climate change is relatively stable, while the spring streamflow is more sensitive to climate change.
Through the above results, the temperature, precipitation and streamflow of the Kaidu watershed in 2010-2099 were identified, which is beneficial for local water management. However, there are still some limitations in this study. For example, many GCMs have been developed by many institutes, but only two GCMs are used in this study to explore their uncertainties. This may not be enough to provide more accurate projection results. Thus, more GCMs and their ensemble outputs should be adopted in future studies. In addition, this study focuses on a data-scarce region with only two meteorological stations. This may dilute the heterogeneity of climate features in the large watershed and introduce errors in streamflow prediction. In future studies, multiple-source meteorological data (e.g., reanalysis data and remote sensing data) are suggested to capture higher-resolution climatic and hydrological characteristics.