2.1. Definition and Division of Study Area
The scope of the Qinghai-Tibet Plateau (QTP) is defined differently across studies. We adopted the high-frequency use scope of the QTP (HF-QTP) defined by Zhang et al. [33], which includes several major river headwaters and matches most research requirements. The QTP ranges from 25°59′37″ N to 39°49′33″ N and from 73°29′56″ E to 104°40′20″ E, covering an area of 2542.30 × 10³ km² [34]. Its altitude ranges from 85 m to 8233 m, and approximately 90.9% of the area lies above 3000 m (high altitude). Topographically, this range includes the world's highest mountains and massive deep valleys, and the complex terrain of the QTP results in a complicated distribution of precipitation and surface temperature. A brief overview can be seen in
Figure 1. These regions can be divided into ten subregions based on watersheds [35,36], as follows: I. Tarim River (TMR); II. Qaidam Basin (QDB); III. Hexi Basin (HXB); IV. Yellow River (YR); V. Inner Qiangtang Basin (IQTB); VI. Yangtze River (YZR); VII. Indus River (IDR); VIII. Salween River (SWR); IX. Mekong River (MKR); and X. Yarlung Zangbo River (YLZBR). Among these, IQTB and QDB are inland river basins, and some of their river channels are seasonal. We obtained the river distribution of QDB from the National Basic Geographic Information database [37].
Currently, there are only 131 long-observed meteorological stations in the QTP (Figure 1) maintained by the China Meteorological Administration (CMA). They are primarily situated in the eastern part of the QTP, with observations lacking in the western part of the plateau. The digital elevation model (DEM, measured from sea level, units: m) used in this section and Section 2.5 is provided by the National Tibetan Plateau Data Center, https://data.tpdc.ac.cn (accessed 24 March 2021) [38].
In Figure 1, the lower Yarlung Zangbo valley (LYZV) is highlighted with red dots. Its boundary is defined by the watershed of the lower Yarlung Zangbo, the section where the Yarlung Zangbo River joins the Brahmaputra River. Unlike other regions of the QTP, the LYZV lies south of the Himalayas, is controlled by the Indian monsoon [39,40,41,42,43], and spans greatly varied elevations (from 85 m to 6941 m). Because of the water vapor transported by the monsoon and the uplift effect of the valley, precipitation in the LYZV is greater than in other regions [44]. Although the LYZV is an ungauged region, its annual precipitation can be estimated at 2500 mm to 3500 mm using observations from similar neighboring areas south of the Himalayas and early investigation findings [43,45,46,47]. It is therefore essential to provide reasonable estimates for the LYZV, and special methods are required to handle its meteorological data. We designed a method that combines two high-resolution grid datasets to reduce the underestimation of precipitation, described in Section 2.3.
2.2. Historical Meteorological Data
- (1)
Observation data from CMA.
As Figure 1 demonstrates, the Qinghai-Tibet Plateau is a data-sparse region, with limited and unevenly distributed meteorological stations. In the ground observation data published by the CMA (from http://data.cma.cn, accessed 24 March 2021), there are about 147 meteorological stations with continuous long series on the Qinghai-Tibet Plateau, a density of 0.058 stations per 1000 km². Ma et al. [12] demonstrated the impact of overly sparse stations on downscaling representativeness, so in previous studies, CMA station observations were generally used to validate reanalysis methods and results [13,14,15,19,20]. In the present study, CMA data were used to verify the applicability of the downscaling method (Section 2.6). Among them, 8 stations serve as verification stations representing climate characteristics at different altitudes (white dots in Figure 1), and the remaining 139 stations are interpolation stations (black dots in Figure 1). The precipitation and temperature data of the interpolation stations from 1979 to 2013 were interpolated onto the verification station grids (consistent with the HRP-QTP grids) by different methods to compare how well each interpolation method fits the verification stations; a suitable interpolation method was then selected for downscaling. Basic information on the verification stations is shown in Table 2.
- (2)
Reanalysis data.
It is crucial to select appropriate meteorological data for the historical period because multi-model ensemble prediction and bias correction depend heavily on historical data; in other words, the historical data determine whether the output reanalysis data are reasonable and usable. We used two high-resolution (0.1° × 0.1°) reanalysis datasets, from He et al. [11] (the China Meteorological Forcing Dataset, CMFD, 1979–2013) and Ma et al. [12] (Gridded Precipitation for Quantile Mapping, GPQM, 1980–2009), as historical data in our study. Data were downloaded from the National Tibetan Plateau Data Center, http://data.tpdc.ac.cn (accessed 24 March 2021); basic information about CMFD and GPQM is presented in Table 3.
2.3. Combination of Reanalysis Data
CMFD provides a high-resolution meteorological forcing dataset across China, and its temperature performance has been evaluated in western China. It shows better surface temperature results in the QTP's ungauged regions than other reanalysis data [11]; hence, the present study uses CMFD directly as the historical temperature source. However, CMFD precipitation is unverified in the western QTP because of the lack of surface observations, and it shows obvious bias in certain areas. Notably, it underestimates precipitation in the LYZV compared with observations from ground stations in Motuo and its immediate surroundings [48]. Hence, we adopt GPQM, which has been validated with no obvious bias, as the historical precipitation source in the LYZV. GPQM considers observation stations both inside and surrounding the QTP [12] and includes decades of gauge data from south of the Himalayas, so it can comprehensively reflect the precipitation characteristics of the QTP. To align GPQM with the historical period of the HRP-QTP, we extend it using CMFD; the extension method is shown in Figure 2.
Figure 2 uses three background colors to denote grid data from distinct sources: orange for GPQM data, purple for CMFD data, and green for the final 0.1° × 0.1° gridded QTP precipitation output. Initially, we extracted QTP precipitation, excluding the LYZV, from CMFD to serve as the historical precipitation source for the other regions. Then, precipitation data for the LYZV were extracted from GPQM onto the same grids as CMFD. Given that GPQM and CMFD share the same resolution, we applied GPQM data grid by grid to correct CMFD within the LYZV. This correction allowed us to extend the GPQM data so that precipitation from both sources matched in the time dimension; we used an updated nonstationary CDF-matching method (CNCDFm), detailed in Section 2.5, for this correction. Finally, the extended GPQM precipitation for the LYZV was combined with the CMFD precipitation for the other regions, creating a comprehensive historical precipitation dataset for the entire QTP. This combined dataset serves as the observed precipitation in subsequent steps.
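The merge described above can be sketched in a few lines of array code. The arrays, mask, and shapes below are hypothetical stand-ins for the real 0.1° grids, and the GPQM field is assumed to have already been extended to the CMFD time dimension:

```python
import numpy as np

def combine_precipitation(cmfd, gpqm_ext, lyzv_mask):
    """Merge two gridded precipitation sources into one QTP field.

    cmfd      -- ndarray (time, lat, lon): CMFD precipitation
    gpqm_ext  -- ndarray (time, lat, lon): GPQM precipitation already
                 extended to the CMFD time dimension
    lyzv_mask -- boolean ndarray (lat, lon): True inside the LYZV
    """
    combined = cmfd.copy()
    # Inside the LYZV, replace CMFD values with the extended GPQM values;
    # everywhere else the CMFD values are kept unchanged.
    combined[:, lyzv_mask] = gpqm_ext[:, lyzv_mask]
    return combined

# Toy illustration: 2 time steps on a 3 x 3 grid, LYZV = one cell.
cmfd = np.ones((2, 3, 3))
gpqm = np.full((2, 3, 3), 5.0)
mask = np.zeros((3, 3), dtype=bool)
mask[1, 1] = True
out = combine_precipitation(cmfd, gpqm, mask)
```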
2.4. Simulation and Prediction Data from GCMs
Seven CMIP6 GCMs were used to produce the HRP-QTP (see Table 4). The GCM outputs were obtained from https://esgf-node.llnl.gov/search/cmip6/ (accessed 24 March 2021). These GCMs were selected because their precipitation, maximum temperature, and minimum temperature data are available across most future scenarios, and they represent a range of resolutions, from 1.12° to 2.5°. In an assessment by Lun et al. [28], the 7 GCMs overestimate QTP precipitation by 0.8–1.2 mm d⁻¹ in the historical period compared with CMFD. We interpolated them to a resolution of 0.1° and used historical data to correct their systematic bias. The global simulation results of these CMIP6 climate models come from institutions worldwide using different modeling methods. Compared with the representative concentration pathways (RCPs) of CMIP5, CMIP6 developed a new framework, the shared socioeconomic pathways (SSPs), to quantitatively describe the relationship between climate change and socioeconomic development [26]. In this study, we selected SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5 to represent different development and concentration pathways. These correspond, respectively, to a sustainable development scenario, a medium-challenge scenario for mitigation and adaptation, a high-challenge scenario for mitigation and adaptation, and a high radiative forcing scenario with continued fossil fuel use in the future period [49]. The four SSPs span radiative forcing from low to high. SSP1-2.6, SSP2-4.5, and SSP5-8.5 align with the radiative forcing assumptions of RCP2.6, RCP4.5, and RCP8.5, respectively, while SSP3-7.0 serves as an alternative high radiative forcing scenario: although it projects lower greenhouse gas concentrations than SSP5-8.5, it implies that human society will face a greater risk of climate hazards [50]. We chose the first three SSPs to align our results with CMIP5 and SSP3-7.0 to represent a more likely high radiative forcing scenario.
2.5. Methods of Processing CMIP6 Data
Methods used to process the CMIP6 data are divided into three steps: downscaling, multi-model ensemble forecasting, and bias correction. First, we used a statistical method to downscale the 7 GCM outputs from their different resolutions. Second, a prior-based method was used to generate an ensemble forecast from the GCMs. Finally, all results from the downscaling and multi-model ensemble were corrected by a nonstationary method. Details of the three steps are as follows:
- (1)
Downscaling.
Most previous studies used linear interpolation methods to downscale GCM data, ignoring the relationship between altitude and meteorological elements [2,4,25]. However, orographic studies have demonstrated an approximately linear correlation of precipitation and temperature with elevation in high-altitude regions [51], and several studies have shown that altitude must be considered when generating high-resolution meteorological forcing [52,53]. Hence, we used ANUSPLIN (V4.36) to downscale the GCM data. ANUSPLIN is a software package based on the theory of thin-plate smoothing splines, an extension of linear regression that considers distance, altitude, and other variables. Ma et al. [12] confirmed that ANUSPLIN is more accurate than the inverse distance weighted and ordinary kriging methods for monthly precipitation interpolation over the QTP. In this study, ANUSPLIN was used to downscale daily precipitation and surface temperature, with altitude as a covariate. More details about ANUSPLIN can be found in Hutchinson and Gessler [54].
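ANUSPLIN itself is a standalone package, but the idea of a thin-plate smoothing spline with altitude as an extra predictor can be illustrated with SciPy's RBF interpolator. The station values below are synthetic, and the 6.5 K/km lapse rate is only an assumption for the toy data, not a value from this study:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)

# Synthetic stations: (lon, lat, elevation in km) plus a temperature
# that falls with elevation at an assumed 6.5 K/km lapse rate.
n = 50
lon = rng.uniform(85.0, 100.0, n)
lat = rng.uniform(28.0, 38.0, n)
elev_km = rng.uniform(1.0, 5.0, n)
temp = 25.0 - 6.5 * elev_km + rng.normal(0.0, 0.3, n)

# Thin-plate spline over the 3-D predictor space (lon, lat, elevation);
# feeding elevation in as a coordinate mimics treating altitude as a
# covariate (a conceptual analogue of ANUSPLIN, not ANUSPLIN itself).
pts = np.column_stack([lon, lat, elev_km])
tps = RBFInterpolator(pts, temp, kernel="thin_plate_spline")

# Predict at a target grid cell whose DEM elevation is 4 km.
pred = tps(np.array([[92.0, 33.0, 4.0]]))[0]
```

Because the thin-plate spline includes a degree-1 polynomial term, the linear elevation-temperature trend in the synthetic data is recovered almost exactly, which a purely two-dimensional interpolator could not do.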
- (2)
Multi-model ensemble forecast.
Due to the uncertainty in GCMs, the results of different climate models are widely distributed, and it is necessary to provide definite projection results for specific studies [55]. In this research, Bayesian model averaging (BMA) was used to produce a multi-model ensemble forecast based on the 7 GCM outputs in the historical period and the 4 SSPs. BMA is a statistical method based on the posterior probability density functions (pdfs) of model predictions [56]. BMA, unlike the multi-model average (MMA), can be described as a weighted average, as follows:

$$M_{BMA} = \sum_{k=1}^{K} w_k M_k \tag{1}$$

where $M_{BMA}$ is the output result of BMA, $M_k$ is the output of model $k$, and $w_k$ is the weight of model $k$, denoting the posterior probability of model $k$ conditioned on the training data. If we assume that the actual climate change process, $y$, is a combination of $K$ model outputs, $w_k$ can be estimated by fitting the data to the following relationship:

$$p(y) = \sum_{k=1}^{K} w_k \, p(y \mid M_k) \tag{2}$$

where the left side of Equation (2) denotes the pdf of the truth value in the historical period, and $p(y \mid M_k)$ denotes the conditional probability distribution of $y$ given $M_k$; it represents the probability that $y$ follows the pdf of the output from model $k$. When $p(y \mid M_k)$ is assumed to follow a normal distribution, it can be estimated as follows:

$$p(y \mid M_k) = \prod_{t} \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\!\left(-\frac{(y^t - M_k^t)^2}{2\sigma_k^2}\right) \tag{3}$$

with $y^t$ and $M_k^t$ representing the $y$ and $M_k$ values at time $t$, respectively, and $\sigma_k^2$ the variance of model $k$. In this study, a Markov chain Monte Carlo (MCMC) method proposed by Vrugt et al. [57] was used to sample the weights of the 7 GCM outputs. The weights were calculated from the historical GCM outputs in every 0.1° grid cell to preserve the spatial distribution characteristics of the meteorological variables. The gridded daily temperature and precipitation data obtained in Section 2.3 were used as truth values. Unlike taking the mean of several models, BMA accounts for each model's performance during the training period, which effectively reduces the uncertainty of the ensemble result [56,57,58]. MMA was also calculated in this study for comparison with BMA.
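A toy illustration of the weighted-average idea in Equation (1): here the weights are obtained by normalising each model's Gaussian likelihood on the training series, a crude stand-in for the MCMC sampling actually used in the study; all series and variances are synthetic:

```python
import numpy as np

def bma_weights(models, truth, sigma):
    """Crude stand-in for BMA weight estimation.

    models -- ndarray (K, T) of model outputs in the training period
    truth  -- ndarray (T,) of 'observed' (reanalysis) values
    sigma  -- ndarray (K,) per-model standard deviations
    Scores each model by its Gaussian log-likelihood and normalises;
    this is a rough illustration, not the MCMC estimator of the paper.
    """
    # Gaussian log-likelihood of the truth under each model's pdf.
    ll = (-0.5 * np.sum(((truth - models) / sigma[:, None]) ** 2, axis=1)
          - truth.size * np.log(sigma))
    w = np.exp(ll - ll.max())      # subtract max for numerical stability
    return w / w.sum()

def bma_mean(models, weights):
    # Weighted ensemble mean: M_BMA = sum_k w_k * M_k
    return weights @ models

rng = np.random.default_rng(1)
truth = rng.normal(0.0, 1.0, 200)
# Three synthetic 'models' with increasing error levels.
models = np.stack([truth + rng.normal(0.0, s, 200) for s in (0.2, 1.0, 3.0)])
w = bma_weights(models, truth, sigma=np.array([0.2, 1.0, 3.0]))
ens = bma_mean(models, w)
```

The most accurate model receives almost all of the weight, which is exactly how BMA differs from the equal-weight MMA.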
- (3)
Bias correction of nonstationary sequence.
Because climate processes are complex and cannot be described exactly by existing methods, all GCMs retain considerable biases in their outputs. Previous studies have adopted many statistical methods to eliminate the systematic errors of models, such as linear and nonlinear corrections or cumulative distribution function (CDF) matching [31,59,60]. It is also important to retain nonstationary information in future scenarios and to avoid abnormal values during bias correction. In this research, we adopted a nonstationary bias correction technique proposed by Miao et al. [61], called updated nonstationary CDF-matching (CNCDFm), to correct the downscaled GCM data and the BMA results. The traditional CDF-matching correction can be written as:

$$\hat{x}_{m\text{-}p}(t) = F_{o}^{-1}\big(F_{m\text{-}p}(x_{m\text{-}p}(t))\big) \tag{4}$$

and the CNCDFm corrections as:

$$\hat{x}_{m\text{-}p}(t) = x_{m\text{-}p}(t) + F_{o}^{-1}\big(F_{m\text{-}p}(x_{m\text{-}p}(t))\big) - F_{m\text{-}h}^{-1}\big(F_{m\text{-}p}(x_{m\text{-}p}(t))\big) \tag{5}$$

$$\hat{x}_{m\text{-}p}(t) = x_{m\text{-}p}(t) \times \frac{F_{o}^{-1}\big(F_{m\text{-}p}(x_{m\text{-}p}(t))\big)}{F_{m\text{-}h}^{-1}\big(F_{m\text{-}p}(x_{m\text{-}p}(t))\big)} \tag{6}$$

where $x$ denotes the meteorological variable from historical observations ($o$, here the reanalysis data obtained in Section 2.3) or models ($m$) during the historical ($h$), current climate ($c$), or future projection ($p$) periods, $F(\cdot)$ is the CDF of a variable, and $F^{-1}(\cdot)$ is its inverse; the additive form of Equation (5) is applied to temperature and the multiplicative form of Equation (6) to precipitation. CNCDFm is an updated version of the CDF technique. By combining the method of Li et al. [62] with that of Wang and Chen [63], it retains nonstationary information in future scenarios and avoids abnormal values; for instance, it avoids negative precipitation and unreasonably high values (which can occur when the denominator on the right-hand side of Equation (6) is very small). Miao et al. [61] validated the method's efficiency within CMIP5.
Estimating the CDFs is the key point of CNCDFm. For robustness, we used the empirical cumulative distribution function (ECDF) to estimate the CDF in all periods. The 1979–2013 period served as the training period to estimate $F_{o}(\cdot)$ and $F_{m\text{-}p}(\cdot)$ from the ECDFs of the observed data and the downscaled model output (obtained from Section 2.5 "Downscaling"). The wet-day threshold was set at 0.1 mm/day to correct the drizzle effect in precipitation data [31,64]. Model simulations in the historical period and the output of BMA were corrected in this step.
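The additive branch of the correction can be sketched with empirical quantiles. This is a simplified equidistant CDF-matching step only, without the multiplicative precipitation branch or the nonstationarity safeguards of the full CNCDFm, and all series are synthetic:

```python
import numpy as np

def ecdf_quantile(sample, q):
    """Empirical inverse CDF: value(s) at quantile(s) q of `sample`."""
    return np.quantile(np.asarray(sample), q)

def edcdf_correct(x_mp, x_mh, x_o):
    """Additive equidistant CDF-matching step (one building block of
    CNCDFm): x_adj = x + F_o^{-1}(q) - F_{m-h}^{-1}(q), q = F_{m-p}(x).

    x_mp -- model values in the projection period
    x_mh -- model values in the historical period
    x_o  -- observed (reanalysis) values in the historical period
    """
    # q = F_{m-p}(x): rank of each projected value within its own period.
    q = np.searchsorted(np.sort(x_mp), x_mp, side="right") / len(x_mp)
    q = np.clip(q, 0.001, 0.999)   # keep away from the extreme tails
    return x_mp + ecdf_quantile(x_o, q) - ecdf_quantile(x_mh, q)

rng = np.random.default_rng(2)
x_o = rng.normal(0.0, 1.0, 1000)    # 'observed' historical climate
x_mh = rng.normal(2.0, 1.0, 1000)   # model with a +2 bias historically
x_mp = rng.normal(3.0, 1.0, 1000)   # projection: +1 signal, same bias
x_adj = edcdf_correct(x_mp, x_mh, x_o)
```

The corrected projection keeps the +1 climate-change signal while the +2 systematic bias is removed, which is the point of matching quantiles rather than subtracting a constant blindly.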
2.6. Effectiveness Evaluation of Methods
- (1)
Applicability test of downscaling method.
This study designed an experiment to compare the effectiveness of different interpolation methods. Using CMA historical observations, 8 stations (white dots in Figure 1) were used for validation because of their long and complete observation series. Observations from the remaining 139 stations (black dots in Figure 1) were interpolated to the validation stations using three downscaling methods: ANUSPLIN, inverse distance weighting (IDW) [65], and ordinary kriging (OK) [65]; the latter two do not take altitude into consideration and serve as baselines for comparison with ANUSPLIN.
Here, IDW is used to represent linear two-dimensional interpolation, defined as follows:

$$Z_j = \frac{\sum_{i=1}^{n} Z_i \, d_{ij}^{-2}}{\sum_{i=1}^{n} d_{ij}^{-2}} \tag{7}$$

where $Z_j$ represents the interpolation result at verification station $j$, $Z_i$ represents the observation from interpolation station $i$, $n$ is equal to 139, and $d_{ij}$ is the planar Euclidean distance in kilometers between station $j$ and station $i$.
OK is used to represent non-linear two-dimensional interpolation, which is equivalent to thin-plate spline interpolation without considering covariates. In this paper, OK is also based on ANUSPLIN; for technical details, please refer to Hutchinson and Gessler [54].
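A minimal IDW implementation for a single verification station might look as follows; the distance exponent of 2 is a common default and an assumption here, not a value stated in the text:

```python
import numpy as np

def idw(obs_xy, obs_val, target_xy, power=2.0):
    """Inverse-distance-weighted estimate at one target point.

    obs_xy    -- ndarray (n, 2): interpolation-station coordinates (km)
    obs_val   -- ndarray (n,): station observations
    target_xy -- ndarray (2,): verification-station location
    power     -- distance exponent (2 is a common choice; assumed here)
    """
    d = np.linalg.norm(obs_xy - target_xy, axis=1)
    if np.any(d == 0.0):          # target coincides with a station
        return float(obs_val[np.argmin(d)])
    w = d ** -power               # closer stations get larger weights
    return float(np.sum(w * obs_val) / np.sum(w))

# Toy usage: the nearest station dominates the estimate.
xy = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
val = np.array([1.0, 5.0, 9.0])
est = idw(xy, val, np.array([1.0, 0.0]))
```

Because IDW works only in the horizontal plane, two nearby stations at very different altitudes are treated as equally informative, which is exactly the limitation the altitude-aware ANUSPLIN avoids.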
The interpolation results were compared with ground observations using the Wasserstein distance (WD) [66] and quantile-quantile (Q-Q) plots. WD, also called the earth mover's distance (EMD), efficiently measures the similarity between two sample distributions [67]; it can be defined as follows:

$$WD(P_1, P_2) = \inf_{\gamma \in \Gamma(P_1, P_2)} \mathbb{E}_{(x, y) \sim \gamma}\big[\lVert x - y \rVert\big] \tag{8}$$

where $\Gamma(P_1, P_2)$ denotes the set of all possible joint probability distributions between $P_1$ and $P_2$, $\inf$ indicates the greatest lower bound, and $\mathbb{E}_{(x, y) \sim \gamma}[\lVert x - y \rVert]$ denotes the expected distance between samples $x$ and $y$ (which follow the $P_1$ and $P_2$ distributions, representing the interpolated result and the observation at a verification station, respectively) when their joint distribution is $\gamma$. In other words, WD indicates the minimum work needed to move $P_1$ into the same distribution as $P_2$: the smaller the WD, the closer the probability distributions of the interpolation results and the observations. The interpolation method with the minimum WD most effectively captures the statistical distribution characteristics of the verification station.
Due to the inherently high randomness and uncertainty of weather systems, traditional Euclidean metrics such as the mean absolute error (MAE) and root mean square error (RMSE) are of limited use for evaluating interpolation methods [12], because almost all interpolation methods struggle to capture the daily variations at unknown points [68]. In contrast, evaluating the fitted distribution characteristics through WD is more meaningful [69]. Long-term climate change studies do not necessarily require precise daily precipitation or temperature values for specific days, but accurate probability distribution characteristics within specific periods are essential; this is crucial for research on extreme statistics and matters for both historical simulations and future predictions. Therefore, we adopt WD as the evaluation metric.
- (2)
Statistical validation of bias correction.
To verify the validity of the correction method, we evaluated the absolute mean bias (AMB) of each model before and after correction. The evaluation proceeded as follows. First, we randomly selected 20 years from the 34-year historical period; the GCM outputs of this 20-year period were divided into two equal-length parts, $x_{m\text{-}h}$ and $x_{m\text{-}c}$, playing the roles of the historical and projection data in Equations (5) and (6). Secondly, $x_{m\text{-}c}$ was corrected using $x_{m\text{-}h}$ and the reanalysis data of the corresponding period; the pre-correction and post-correction AMBs against these observations were calculated as follows:

$$AMB_{pre} = \left| \frac{1}{n} \sum_{t=1}^{n} \big( x_{m\text{-}c}(t) - x_o(t) \big) \right| \tag{9}$$

$$AMB_{post} = \left| \frac{1}{n} \sum_{t=1}^{n} \big( \hat{x}_{m\text{-}c}(t) - x_o(t) \big) \right| \tag{10}$$

where $x_o$ denotes the values extracted from the reanalysis data obtained in Section 2.3, $\hat{x}_{m\text{-}c}$ denotes the corrected values, and $n$ denotes the sample size in the validation period.
The above steps were repeated 10 times in each 0.1° resolution grid to assess the performance of the multi-model ensemble forecast. Bias correction was evaluated by the mean of these 10 iterations to guarantee the robustness of the evaluation [50], and the AMB uses absolute values to prevent positive and negative biases from canceling out.
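One iteration of the split-and-correct validation can be sketched for a single grid cell; the bias magnitude and the simple mean-shift "correction" below are synthetic placeholders standing in for a GCM and the CNCDFm step:

```python
import numpy as np

def amb(model_vals, obs_vals):
    """Absolute mean bias: |mean(model) - mean(obs)|."""
    return abs(float(np.mean(model_vals)) - float(np.mean(obs_vals)))

rng = np.random.default_rng(4)

# Pick 20 of the 34 historical years and split them into a training
# half (plays x_{m-h}) and a validation half (plays x_{m-c}).
years = rng.choice(np.arange(1979, 2013), size=20, replace=False)
train_years, valid_years = np.sort(years)[:10], np.sort(years)[10:]

# Synthetic annual values for one grid cell in the validation years.
obs = rng.normal(400.0, 30.0, 10)                    # reanalysis 'truth'
model_raw = obs + 80.0 + rng.normal(0.0, 10.0, 10)   # biased model output
# Placeholder correction: remove the mean offset (stands in for CNCDFm).
model_corr = model_raw - (model_raw.mean() - obs.mean())

amb_pre = amb(model_raw, obs)    # large, reflects the built-in bias
amb_post = amb(model_corr, obs)  # ~0 by construction
```

Repeating this with fresh random year selections and averaging the resulting AMB pairs reproduces the 10-iteration scheme described above.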
Quantile bias ($QB$) is used as a supplementary test to evaluate the ability of the bias correction to handle extreme values. The $QB$ at a specified percentile $p$ can be defined as follows:

$$QB(p) = F_{m\text{-}c}^{-1}(p) - F_{o}^{-1}(p) \tag{11}$$

where $x_{m\text{-}c}$ and $x_o$ are the same as in Equations (5) and (6), and $F^{-1}(p)$ is estimated by the ECDF before and after correction.
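A quantile-bias check under the sign convention assumed above (model quantile minus observed quantile), with synthetic series:

```python
import numpy as np

def quantile_bias(model_vals, obs_vals, p):
    """QB(p): model empirical quantile minus observed empirical quantile
    (sign convention assumed here; the text does not spell it out)."""
    return float(np.quantile(model_vals, p) - np.quantile(obs_vals, p))

rng = np.random.default_rng(5)
obs = rng.normal(0.0, 1.0, 2000)
model = rng.normal(0.0, 2.0, 2000)   # same centre, inflated spread

# The inflated spread shows up as a positive bias at high quantiles
# even though the mean bias is near zero, which is why QB complements
# the AMB for extremes.
qb95 = quantile_bias(model, obs, 0.95)
```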
- (3)
Evaluation of multi-model uncertainty.
Uncertainty is a key factor affecting the usability of model prediction results; the signal-to-noise ratio (
SNR) was used to quantify the proportion of valid information amid uncertainty.
SNR is defined as follows:
with
where,
means the prediction result of MMA,
denotes the historical simulation result,
n denotes the number of climate models, and
denotes the prediction result of model
i. It indicates that reliability is higher than uncertainty when SNR is greater than 1. Furthermore, because SNR is dimensionless, it can be used to compare different variables.