A Methodological Framework to Retrospectively Obtain Downscaled Precipitation Estimates over the Tibetan Plateau

Long-term precipitation estimates with both finer spatial resolution and better quality are vital and highly needed in various related fields. Numerous downscaling algorithms have been investigated based on the Tropical Rainfall Measuring Mission (TRMM) Multi-Satellite Precipitation Analysis (TMPA), to obtain precipitation data with finer resolution (~1 km). However, this research was restricted by the time span of the TMPA dataset, as the starting time of TMPA was 1998. In this study, a new methodological framework incorporating wavelet coherence and Cubist was proposed to retrospectively obtain downscaled precipitation estimates (DS) over the Tibetan Plateau (TP), based on TMPA and ground observations, in 1990s. The correlations and similarities of precipitation patterns between the target years, from 1990 to 1999, and reference years, from 2000 to 2013, were firstly determined using wavelet coherence based on ground observations. Following this, the TMPA data in the reference years were regarded as the reference in the corresponding target years, which were adopted to be downscaled using Cubist models and land surface variables, to obtain the DS in the target years. We found that the DS showed continuous trends, which corresponded well with the ground observations. Additionally, the performances of the DS were better than those of the Climate Hazards group Infrared Precipitation with Stations (CHIRPS) data over the TP. Therefore, this methodological framework has great potential for obtaining precipitation estimates for the period of the 1990s for which TMPA data is inaccessible.


Introduction
Precipitation is commonly regarded as one of the most important variables in hydrology, meteorology, climate change, etc. [1].Therefore, long-term precipitation information is in great need in related scientific and application fields.It is a great challenge to monitor and measure precipitation amounts and trends in particular in areas with complex terrain, for instance the Tibetan Plateau (TP).TP is the largest plateau in the world with an average elevation of >4000 m [2], which has also been identified as the pilot region to study climate fluctuations in the world due to its sensitivity to climate change.The temperature in TP has been rising since the 1950s [3], and precipitation shows an increasing trend since the 1990s [4].However, the rain gauge network is significantly scarce over the TP due to its harsh environment and complex terrain [5], which makes it difficult to obtain spatially explicit precipitation information over the plateau.Additionally, strong variability in physiography can lead to localized variations in climatic conditions, needing much more information on the critical climatic variables, i.e., precipitation.Therefore, to investigate the regional climate fluctuations in the TP, long-term precipitation estimates with both finer resolutions (~1 km) and better accuracy are in great need.
Conventional point-based spatial interpolations using rain gauge measurements are used to generate high spatial resolution precipitation datasets.However, complex topography with sparse rain gauges often causes large uncertainties and errors in obtaining precipitation information using spatial interpolations [6].Meanwhile, ground-based radar is spatially limited and practically useless over the TP.Thus, satellite-based precipitation products become the optimum source for providing precipitation estimates with larger spatial coverage.Satellite-based precipitation products, such as the Tropical Rainfall Measurement Mission (TRMM) Multi-Satellite Precipitation Analysis (TMPA) dataset, were extensively used over the past two decades.The TMPA data provides valuable precipitation information in areas with sparse rain gauges.However, coarse spatial resolution (0.25 • ) of the TMPA data restricts its application at the regional and basin scales [7].To obtain precipitation estimates with finer spatial resolutions, numerous downscaling algorithms combining a suite of environmental variables have been investigated [8][9][10][11].The Normalized Difference Vegetation Index (NDVI) was firstly employed to build exponential relationships with TMPA data [8].Considering the effects of topography, regression kriging to establish predictive models based on the digital elevation model (DEM) was developed by [9].In [10], both the NDVI and DEM were introduced in a multiple liner regression model to downscale TMPA data.Furthermore, [11] proposed a geographically weighted regression (GWR) model using both the NDVI and DEM.Meanwhile, other studies have demonstrated how various land surface characteristics, such as land surface temperature (LST) [12], topographical factors [13,14], influenced the spatial distribution of precipitation.Therefore, Ma et al. [15,16] incorporated a series of land surface variables, including NDVI, DEM, LST and other topographical factors, into a spatial data mining algorithm in order to downscale TMPA data.
Nevertheless, all these downscaling approaches based on TMPA data were restricted by the time span after 1998, due to the availability of TMPA data, thereby missing information from a very important period in investigating climate variability in the TP.For example, precipitation was reported to show an increasing trend in TP since the 1990s.Thus, obtaining precipitation estimates at fine spatial resolutions (~1 km) before the TMPA launch are critical, calling for new approaches.
This study intended to develop a new methodological framework, which incorporated wavelet coherence and Cubist, to retrospectively obtain downscaled precipitation estimates (DS) over the TP, based on TMPA data and ground observations, in the 1990s.Firstly, wavelet coherence is a powerful approach to detect the inherent similarities and correlations of precipitation between the target time periods, from 1990 to 1999, and the reference time periods, from 2000 to 2013, at different temporal scales based on ground observations [17][18][19].Then, the TMPA data in the reference years were regarded as the reference in the corresponding target years, which were adopted to be downscaled using Cubist, a spatial data mining algorithm, to obtain the DS in the target years.The main objectives of this study were: (1) to develop a methodological framework to retrospectively obtain DS in the 1990s over the TP; (2) to evaluate the performance of the DS against ground observations; and (3) to compare the performance of the DS with those of the Climate Hazards group Infrared Precipitation with Stations (CHIRPS) data.

Study Area
The TP, formed after the collisions between the Indian plate and Eurasian continent plate, is the largest and highest plateau in the world, with a mean elevation above 4000 m (Figure 1).Due to its unique topography and location, the TP has complex climatic conditions [20].Two main atmospheric circulation patterns affect the TP [21].For instance, the Indian monsoons prevails in the central and southern parts of the plateau, leading to relatively larger precipitation in these regions (>1000 mm/year) in summer.While in the northern and western parts, which are mainly influenced by westerlies in winter, smaller precipitation occurs (<200 mm/year).Additionally, there is a clear demarcation between dry and wet seasons; about 90% of total annual precipitation occurs during the rainy season (mainly from March to August) [14].

Study Area
The TP, formed after the collisions between the Indian plate and Eurasian continent plate, is the largest and highest plateau in the world, with a mean elevation above 4000 m (Figure 1).Due to its unique topography and location, the TP has complex climatic conditions [20].Two main atmospheric circulation patterns affect the TP [21].For instance, the Indian monsoons prevails in the central and southern parts of the plateau, leading to relatively larger precipitation in these regions (>1000 mm/year) in summer.While in the northern and western parts, which are mainly influenced by westerlies in winter, smaller precipitation occurs (<200 mm/year).Additionally, there is a clear demarcation between dry and wet seasons; about 90% of total annual precipitation occurs during the rainy season (mainly from March to August) [14].

Ground Observations
The annual precipitation records from 1954 to 2013 were provided by the China Meteorological Administration (http://data.cma.cn/).The working meteorological stations available over the TP were sparse and uneven due to the complex terrain and harsh environmental conditions (Figure 1).Most of the stations are located in the south and east parts of TP, and mainly at mid /low elevations.

Tropical Rainfall Measuring Mission (TRMM) Multi-Satellite Precipitation Analysis (TMPA) Dataset
Through the joint efforts of the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA), TRMM was successfully launched in November 1997.TRMM was equipped with instruments including the Visible and Infrared Scanner, TRMM Microwave Imager (TWI) and Precipitation Radar (PR), which was mainly designed to investigate precipitation intensity and distribution in subtropical and tropical regions.The TMPA 3B43 provides monthly precipitation estimates over regions spanning from 50°S to 50°N at the spatial resolution of 0.25° [22].The TMPA 3B43 dataset between 2000 and 2013 were employed in this study, which can be downloaded from https://pmm.nasa.gov/data-access/downloads/trmm.

Ground Observations
The annual precipitation records from 1954 to 2013 were provided by the China Meteorological Administration (http://data.cma.cn/).The working meteorological stations available over the TP were sparse and uneven due to the complex terrain and harsh environmental conditions (Figure 1).Most of the stations are located in the south and east parts of TP, and mainly at mid /low elevations.

Tropical Rainfall Measuring Mission (TRMM) Multi-Satellite Precipitation Analysis (TMPA) Dataset
Through the joint efforts of the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA), TRMM was successfully launched in November 1997.TRMM was equipped with instruments including the Visible and Infrared Scanner, TRMM Microwave Imager (TWI) and Precipitation Radar (PR), which was mainly designed to investigate precipitation intensity and distribution in subtropical and tropical regions.The TMPA 3B43 provides monthly precipitation estimates over regions spanning from 50 • S to 50 • N at the spatial resolution of 0.25 (CCD) rainfall estimates and blends precipitation observations from a variety of sources including national and regional meteorological services [23,24].The spatial resolution of CHIRPS is higher than other satellite-based global precipitation datasets, making it favorable to analyze precipitation variations at finer scales.In this study, monthly data from the CHIRPS database for the period from January 1990 to December 1999 are employed (ftp://ftp.chg.ucsb.edu/pub/org/chg/products/CHIRPS-2.0).2.2.4.Normalized Difference Vegetation Index (NDVI) Dataset NDVI is commonly applied to depict the productivity and distribution of vegetation [25,26].The relationships between NDVI and precipitation were extensively investigated by various researchers [15,27,28].The NDVI datasets were obtained from two satellite-based products: (1) for the period from 1990 to 1999, the dataset was obtained from the Global Inventory Monitoring and Modeling System (GIMMS-NDVI3g).The GIMMS produced the global NDVI dataset using the information collected from the Advanced Very High Resolution Radiometer (AVHRR) sensors.The latest version of GIMMS-NDVI3g (3g.v1) has a spatial resolution of 1/12 • × 1/12 • , which can be found at https: //ecocast.arc.nasa.gov/data/pub/gimms/3g.v1/.(2) For the period from 2000 to 2013, the dataset was collected from Moderate Resolution Imaging Spectroradiometer (MODIS).MOD13A2 is a NDVI dataset at 1 km spatial resolution and was obtained from https://ladsweb.modaps.eosdis.nasa.gov/search/.

Land Surface Temperature (LST) Datasets
Two types of land surface temperature (LST) datasets were used in this study.MODIS provided LST records worldwide with an error controlled between −1 K and 1 K.The MOD11A2 products, which produced LST and Emissivity data at 1 km spatial resolution over every eight days, were used in this study for the period between 2000 and 2013.
LST data from 1990 to 1999 was estimated using the split window method following [29].
where t 4 and t 5 were the radiometric temperatures in channel 4 (10.3 to 11.3 µm) and channel 5 (11.5 to 12.5 µm) of the AVHRR sensor, respectively.w represented the volume of water vapor in the atmosphere, ε was calculated by averaging the emissivity for channels 4 and 5, and ∆ε denoted the differences of emissivities between channels 4 and 5.The estimate errors of LST generated by split window could be controlled at ~±1 K compared to in situ data [30,31].Meanwhile, the LST estimates by split window also showed good agreement with those based on MODIS LST algorithm [32,33].The spatial resolution of the processed LST data using split window was approximately 1.1 km at the annual scale.

Topography Datasets
The Shuttle Radar Topography Mission (SRTM) project, cooperated between NASA and the National Geospatial-Intelligence Agency (NGA), was successfully launched in February 2000.The SRTM project provided high-resolution DEM data with a spatial coverage ranging from 56 • S to 60 • N. The 3 arc-second (approximately 90 m) grid DEM was obtained from http://srtm.csi.cgiar.organd used in this study.

Wavelet Coherence
Wavelet coherence is a useful approach for determining the correlations and similarities of patterns in two time series at various scales.For two independent time series, n and m, the wavelet transform was defined as W n t (s) and W m t (s).Then the cross-wavelet spectrum of n and m time series was calculated as: In Equation (2), * denoted the complex conjugate, t and s represented the time index and scale, respectively.The definition of wavelet coherence could be interpreted as the absolute value squared of the smoothed W nm t (s) [34]: where S was a smoothing operator.To determine the 95% significance level of the wavelet coherence results, the Monte Carlo method was implemented.More detailed descriptions and applications about the wavelet coherence could be found in [35,36].A MATLAB software package for performing the wavelet coherence can be found at http://grinsted.github.io/wavelet-coherence/.Two indices, including mean wavelet coherence (MWC) and percent area of significant coherence (PASC), were introduced to evaluate the performances of the wavelet coherence [37,38].MWC and PASC denoted the mean value of the wavelet coherence and the proportion of significant coherence area to the whole wavelet domain, at all scales and periods, respectively.The larger MWC and PASC indicated that the patterns in two time series are more correlated in nature.

Cubist
In this study, Cubist was implemented to construct the predictive models between precipitation and a suite of land surface variables, including NDVI, LST and topographical indices.The Cubist algorithm has two main characteristics: partitioning and model building.By implementing the Cubist method, the entire dataset was divided into several subsets which were grouped according to their geographic similarities.For each separate subset, the rule-based model was constructed by applying a stepwise regression strategy.More details on the Cubist method can be found elsewhere, such as [15,16].In this study, the Cubist model was generated using the package Cubist in R (https: //cran.r-project.org/web/packages/Cubist/).

Mann-Kendall Trend Test
The sequential Mann-Kendall (SQ-MK) test was used in this study to determine the approximate start point of a significant trend [39,40].Meanwhile, this test could also reflect the fluctuations in the trend [41,42].The general trend and mutation time in a time series were estimated as follows.
For collected time-sequencing precipitation data p 1 , p 2 , p 3 , . . ., p t , the time-ordered series (S n ) according to their collection sequence was given as follows: in which, where p i and p j were the precipitation amounts at times i and j, and the progressive statistic (UF (S n ) ) was defined as: E(S n ) and Var(S n ) denoted the mean and variance of S n , which were calculated as Equations ( 7) and (8), respectively.7) UF (S n ) was the forward sequence and all UF (S n ) would result in a curve (UF) which started at 0 (when n = 1).The similar analysis was then employed in the reverse data series.By applying this method, a retrograde UB (S n ) could be computed by: For all UB (S n ) , a curve was designed as UB.S n could be considered as normal distribution under the null hypothesis.For this study the confidence interval was taken as ±1.96 (p = 0.05), which indicated that if the absolute value of UF was larger than 1.96, the trend was considered to be significant at the 95% confidence level.The negative UF reflected a decreasing trend while the positive value of UF represented an increasing trend in the time series.The intersection of UF and UB curves represented the approximate mutation time of the trend within the precipitation time series.

Calibration and Validation
As the DS are an indirect precipitation estimates in nature, the regional bias and random errors from the estimates should be evaluated and corrected [43].Thus, a fraction of the rain gauges were chosen to calibrate the estimates while the remaining stations were used to independently evaluate the performance of the estimates (Figure 1).The gauges for calibration were selected according to their locations and recorded precipitation amounts.Four comparison indices were applied in the evaluation, including coefficients of determination (R 2 ), bias, root mean square error (RMSE) and Nash-Sutcliffe efficiency (NSE).
P i was the precipitation estimate; G i and G were the annual and mean annual values based on ground observations, respectively; i and n were the index and total number of comparison pairs, respectively.The R 2 reflects the strength of the liner relationship between the precipitation estimates and ground observations, ranging between 0 and 1 with an optimal value of 1.The bias measures the tendency of the estimates to be overestimated (positive values) or underestimated (negative values), with a perfect value of 0. The RMSE indicates the magnitude of error estimations, the smaller RMSE means more reliable estimates compared to ground observations.The NSE [44], which varies from −∞ to 1, measures the magnitude of the variance of the residuals compared to those of the observed values of precipitation, and the perfect value of NSE is 1 while being negative indicates poor precipitation estimations.

Main Steps to Retrospectively Obtain Precipitation Estimates (~1 km) in the 1990s
The main steps were described as follows: (1) Wavelet coherence was firstly applied to detect the inherent similarities and correlations of precipitation between the target time periods, from 1990 to 1999, and reference time periods, from 2000 to 2013, at different temporal scales based on ground observations (Figure 2).The target year and the corresponding reference year were determined when the MWC and PASC values were largest.(2) All land surface characteristics, including annual mean LST data, annual mean NDVI and topographical parameters, were aggregated to 0.25 • from corresponding data at their original spatial resolutions, from 2000 to 2013.Then, the Cubist models were built between TMPA data and land surface variables in the reference years at a spatial resolution of 0.25 • .(3) In the target years from 1990 to 1999, the land surface variables at ~1 km were firstly obtained.In terms of NDVI, we interpolated the GIMMS NDVI (1/12 • ) into those at ~1 km using simple spline tension interpolator, which was typically suitable for regularly-spaced data [7], in this study.The gridded GIMMS NDVI data had been converted into those in point-based format, before the interpolations were conducted in ArcGIS 10.2 software (https://www.esri.com/en-us/home).The DS were obtained by applying the Cubist models generated in step (2), in the corresponding reference years determined in step (1), on the land surface variables in the target years (Figure 2).( 4) The calibration data was used to correct the DS, in the target years, obtained in step (3).At the beginning, the point-based ratios, by comparing the ground observations to DS, were interpolated into gridded estimates (~1 km) using the ordinary kriging technique [13].Moreover, the final DS with gauge calibrations were obtained by multiplying the gridded ratios by the DS without gauge calibration obtained in step ( 3). ( 5) The performances of DS at ~1 km resolution were assessed through validation stations.
Meanwhile, the performance of CHIRPS was also evaluated through the same validation stations and compared with those of the DS with/without gauge calibration (Figure 2).

The Trends and Mutation of Precipitation over the Tibetan Plateau (TP)
Figure 3 presented the statistical curves of the  and , using the SQ-MK test based on

The Trends and Mutation of Precipitation over the Tibetan Plateau (TP)
Figure 3 presented the statistical curves of the UF and UB, using the SQ-MK test based on ground observations from rain gauges, in the period from 1954 to 2013 over the TP.The UF and UB curves were intersected in 1984, moreover, the intersection was inside the confidence interval, which indicated that the precipitation patterns over the TP were changed in 1984.The UF curve represented the general trend of the precipitation.According to the UF curve, the temporal variations in precipitation over the TP from 1954 to 2013 could be approximately divided into three periods.The variations in precipitation fluctuated in the first period between 1954 and the late 1960s.The second period was between 1970 and the early 1980s, in which the precipitation variations exhibited a decreasing trend.In the third period (after 1984), the precipitation variations showed a steep increasing trend, and the increasing trend was considered to be significant since 1990, because the UF curve exceeded the line of 95% confidence level.Therefore, this study focused on the period from 1990 to 1999, with the significant increasing trend of precipitation over the TP.

Inter-Annual Correlations and Similarities of Precipitation Patterns
The correlations and similarities of precipitation patterns in the target year, 1990, and reference years from 2000 to 2013, were shown in Figure 4.According to the general trend of the variability, the temporal scales were divided into four groups: very short-period (<8 days), short-period (8-32 days), medium-period (32-64 days) and long-period (>64 days).At the very short-period and short-period temporal scales, the correlations of precipitation patterns in the target year and those in the reference years were not strong, although several significant correlations were observed in some discrete time.In the medium-period scale, the precipitation patterns in 1990 and those in 2006 showed significant correlations from the mid to the end of the year.Furthermore, the statistical correlations of precipitation patterns, in 1990 and those in 2006, were much more significant at the long period scale than those at the medium-period scale (Figure 4).Thus, the precipitation patterns in the reference year, 2006, was considered as the optimal reference to stand for those in 1990.

Inter-Annual Correlations and Similarities of Precipitation Patterns
The correlations and similarities of precipitation patterns in the target year, 1990, and reference years from 2000 to 2013, were shown in Figure 4.According to the general trend of the variability, the temporal scales were divided into four groups: very short-period (<8 days), short-period (8-32 days), medium-period (32-64 days) and long-period (>64 days).At the very short-period and short-period temporal scales, the correlations of precipitation patterns in the target year and those in the reference years were not strong, although several significant correlations were observed in some discrete time.In the medium-period scale, the precipitation patterns in 1990 and those in 2006 showed significant correlations from the mid to the end of the year.Furthermore, the statistical correlations of precipitation patterns, in 1990 and those in 2006, were much more significant at the long period scale than those at the medium-period scale (Figure 4).Thus, the precipitation patterns in the reference year, 2006, was considered as the optimal reference to stand for those in 1990.The precipitation patterns in two years showed similar characteristics and trends throughout the entire year, such as the alternations of dry and wet seasons, and the precipitation peaks, and valleys in specific periods.The relative differences in precipitation between 1990 and 2006 were displayed in Figure 5b.Generally, the differences were almost randomly distributed, at daily scale, year-round, which indicated the precipitation differences did not have any specific patterns.Therefore, wavelet coherence method could determine the inherent similarities between two precipitation patterns in target years and those in reference years, and reveal strong correlations.Figure 5a displayed the curves of the precipitation variations in 1990 (blue line) and those in 2006 (red line), based on the mean values of ground observations over the TP.The precipitation patterns in two years showed similar characteristics and trends throughout the entire year, such as the alternations of dry and wet seasons, and the precipitation peaks, and valleys in specific periods.The relative differences in precipitation between 1990 and 2006 were displayed in Figure 5b.Generally, the differences were almost randomly distributed, at daily scale, year-round, which indicated the precipitation differences did not have any specific patterns.Therefore, wavelet coherence method could determine the inherent similarities between two precipitation patterns in target years and those in reference years, and reveal strong correlations.Hereafter, the correlations and similarities of precipitation patterns in each target year, from 1991 to 1999, and the corresponding reference year, from 2000 to 2013, were matched.The wavelet coherence results based on the correlations of precipitation patterns in the target years from 1991 to 1999 and its corresponding reference years were shown in Figure 6.Similar to the situation in 1990, there were also limited correlations of precipitation patterns detected in discrete period at a very short period scale (<8 day) and short period scale (8-16 day), respectively.However, the strong correlations were always determined at long period temporal scale (>64 days), indicating the precipitation patterns in two independent years were correlated well at the bi-monthly scale.Hereafter, the correlations and similarities of precipitation patterns in each target year, from 1991 to 1999, and the corresponding reference year, from 2000 to 2013, were matched.The wavelet coherence results based on the correlations of precipitation patterns in the target years from 1991 to 1999 and its corresponding reference years were shown in Figure 6.Similar to the situation in 1990, there were also limited correlations of precipitation patterns detected in discrete period at a very short period scale (<8 day) and short period scale (8-16 day), respectively.However, the strong correlations were always determined at long period temporal scale (>64 days), indicating the precipitation patterns in two independent years were correlated well at the bi-monthly scale.Table 2 displayed the quantitative evaluations of the correlations of precipitation patterns in the target years, from 1991 to 1999, and the reference years from 2000 to 2013, at different temporal scales.In accordance with the wavelet coherence results presented in Figure 6, the strongest correlations of precipitation patterns in the target years and the corresponding reference years were detected at long period (>64 day).Additionally, the values of MWC and PASC at long period temporal scale (>64 day) were larger than those at shorter temporal scales, in terms of MWC values Table 2 displayed the quantitative evaluations of the correlations of precipitation patterns in the target years, from 1991 to 1999, and the reference years from 2000 to 2013, at different temporal scales.In accordance with the wavelet coherence results presented in Figure 6, the strongest correlations of precipitation patterns in the target years and the corresponding reference years were detected at long period (>64 day).Additionally, the values of MWC and PASC at long period temporal scale (>64 day) were larger than those at shorter temporal scales, in terms of MWC values (>0.79) and PASC values (>39.16%).Generally speaking, the mean values of MWC and PASC, at all scales, were larger than 0.42 and 19.58%, respectively.

Retrospectively Downscaled Results and Validations
Based on wavelet coherence and the Cubist algorithm, we retrospectively obtained the DS over the TP for the period from 1990 to 1999, at ~1 km resolution.In addition, CHIRPS data from 1990 to 1999 was also introduced and evaluated, in order to compare the performances with those of DS.The spatial pattern of CHIRPS data, determined by averaging precipitation estimates in the years from 1990 to 1999, was shown in Figure 7a, while the spatial patterns of the DS without gauge calibration and those with gauge calibration were shown in Figure 7b,c, respectively.Both CHIRPS data and the DS data showed a similar trend of precipitation over the TP.However, it was obvious that CHIRPS data presented relatively larger precipitation estimates in a small part of the south-western TP with annual precipitation around 800-1000 mm/year, while the annual precipitation of the DS was under 600 mm/year.Due to the coarse resolution of CHIRPS data (~5 km), the spatial pattern of CHIRPS demonstrated significant spatial divisions.The spatial trends of DS varied gradually from south-east to north-west, over the TP, because of the finer spatial resolution (~1 km).
To compare the performances of the DS, both without and with rain gauge calibration, with those of CHIRPS data, we validated these precipitation products against ground observations for the period from 1990 to 1999.The validation of CHIRPS data was shown in Figure 8a, while Figure 8b,c displayed the validations of the DS without rain gauge calibration and those with rain gauge calibration, respectively.Generally speaking, CHIRPS data had the worst performance (R 2 ~0.53, bias ~5.58%, NSE ~0.49 and RMSE ~202.84 mm) among the three precipitation products.The DS without gauge calibration outperformed CHIRPS data, with larger correlations and lower errors (R 2 ~0.75, NSE 0.57, and RMSE ~173.28 mm).The DS with gauge calibration improved the accuracy (R 2 ~0.80, bias 10.03%, NSE ~0.71 and RMSE ~143.06 mm), compared to those without gauge calibration.Table 3 displays the validation results of CHIRPS data, the DS both with and without rain gauge calibration, against ground observations, in the years from 1990 to 1999, over the TP.The CHIRPS data performed relatively worse, with smaller correlations and larger errors (R 2 varying from 0.41 to 0.62, NSE ranging from 0.26 to 0.62, and RMSE fluctuating between 162.42 mm and 255.12 mm).Compared with CHIRPS data, the DS without gauge calibration showed better performances (R 2 > 0.70, NSE > 0.49).The accuracy of the DS with gauge calibration outperformed CHIRPS and DS without gauge calibration in the years from 1990 to 1999 (R 2 ~0.80,NSE ~0.80, and RMSE ~120 mm).
CHIRPS data and the DS data showed a similar trend of precipitation over the TP.However, it was obvious that CHIRPS data presented relatively larger precipitation estimates in a small part of the south-western TP with annual precipitation around 800-1000 mm/year, while the annual precipitation of the DS was under 600 mm/year.Due to the coarse resolution of CHIRPS data (~5 km), the spatial pattern of CHIRPS demonstrated significant spatial divisions.The spatial trends of DS varied gradually from south-east to north-west, over the TP, because of the finer spatial resolution (~1 km).To compare the performances of the DS, both without and with rain gauge calibration, with those of CHIRPS data, we validated these precipitation products against ground observations for the period from 1990 to 1999.The validation of CHIRPS data was shown in Figure 8a, while Figures 8b and 8c    Figure 9a displayed the spatial distributions of mean annual precipitation obtained from CHIRPS, DS without gauge calibration and DS with gauge calibration, while Figure 9b-e demonstrated spatial distributions of R 2 , bias, RMSE and NSE, respectively, of CHIRPS data, the DS without gauge calibration and DS with gauge calibration, against ground observations, from 1990 to 1999.Generally, the spatial distributions of annual precipitation at the stations over the TP were captured well by CHIRPS, DS both without and with gauge calibration.In terms of the evaluation indices, all the precipitation products showed better agreements to ground observations in the southern and eastern TP where precipitation was relatively larger, but had poor performances in western and northern TP, where precipitation was relatively smaller.The DS with gauge calibration showed the best performance considering the strong correlations between precipitation estimates and ground observations with R 2 values even >0.8 in most sites, while the R 2 values of CHIRPS data were acceptable (R 2 > 0.6) in the south-eastern TP, but decreased significantly in the north and south-western parts (R 2 ~0.2).The bias and RMSE values of the DS with calibration also declined, especially over regions close to the Himalayan Mountains, compared with those of CHIRPS data.The NSE values of DS with gauge calibration also improved.These results demonstrated that the DS with gauge calibration provided more reliable precipitation estimates than CHIRPS data and DS without gauge calibration, over the TP.

Comparisons of Precipitation Estimates at Specific Stations
Based on the evaluation results presented in Figure 9, we found that precipitation estimates based on CHIRPS and DS had significant differences over TP, for example, in region A and region B, which could be found in Figure 1.There were four validation stations located in region A, namely Gaize, Pulan, Lazi and Nielaer, with elevation >3800 m above sea level.The mean annual precipitation were 156.5 mm/year, 131.4 mm/year, 362.1 mm/year and 551.8 mm/year, at Gaize, Pulan, Lazi and Nielaer stations, respectively, in region A. Another four validation stations, including Xiaozaohuo, Delingha, GeErMu and NuoMuHong, were distributed in region B. By contrast to region A, which is a mountainous area, region B is a basin area with elevation ~2800 m.The mean annual precipitation recorded by Xiaozaohuo, Delingha, GeErMu and NuoMuHong were 29.2 mm/year, 173.0 mm/year, 51.8 mm/year and 43.3 mm/year, respectively.
Figure 10 illustrates the comparisons of annul precipitation at the aforementioned eight stations based on ground observations, and precipitation estimates obtained from CHIRPS and the DS.The precipitation at the four stations, located in region A were poorly captured by CHIRPS data.It could be seen that CHIRPS data tended to greatly overestimate precipitation at each station in region A, which suggested that CHIRPS data could not capture the annual precipitation volumes appropriately, and provided unfavorable precipitation information in region A. The precipitation estimates obtained from DS were in better agreement with ground observations at the Gaize and Nielaer stations, compared with those of CHIRPS data.While in region B, both the CHIRPS and the DS had reasonable performances at Delingha station, however, CHIRPS data highly overestimated precipitation volumes in 1996 and 1997.Additionally, the DS captured annual precipitation well at Delingha station, such as the precipitation peak in 1992 and 1998, which indicated the better accuracy of the DS.Overall, the DS outperformed CHIRPS data in regions A and B at the specific stations.DS had reasonable performances at Delingha station, however, CHIRPS data highly overestimated precipitation volumes in 1996 and 1997.Additionally, the DS captured annual precipitation well at Delingha station, such as the precipitation peak in 1992 and 1998, which indicated the better accuracy of the DS.Overall, the DS outperformed CHIRPS data in regions A and B at the specific stations.

Improvements and Limations of the Framework
In this study, we tried to obtain precipitation estimates (~1 km) from 1990 to 1999 over the TP retrospectively.The precipitation estimates, obtained by the proposed framework, had reasonable performances compared with those of CHIRPS data, against ground observations.Wavelet coherence demonstrated its abilities to reveal the correlations and similarities of precipitation patterns in two independent years.Meanwhile, Cubist is a newly capable spatial data mining algorithm used to downscale coarse satellite-based precipitation data [14,15].The Cubist algorithm considered the non-stationary relationships between precipitation and land surface variables.Meanwhile, Cubist also performed robustly when considering a large number of predictor variables.Additionally, the liner models generated by Cubist were interpretable [14], while other machine learning algorithms, such as random forest and artificial neutral networks, do not have this ability.Therefore, this methodological framework, combining wavelet coherence and Cubist, had great potentials in obtaining precipitation estimates for the period, in 1990s, when TMPA data is inaccessible.
However, the accuracy of the DS generated by this framework was also subjected to several error sources.Firstly, the correlations and similarities of precipitation patterns between the target years, from 1990 to 1999, and the optimal reference years, from 2000 to 2013, were determined.But the period of the reference years may be not long enough to detect the 'best' year, in which the TMPA and Cubist models were applied retrospectively to obtain the DS.For example, the precipitation patterns in 2006 were selected as the reference to obtain precipitation estimates for the target years both in 1990 and 1997.Additionally, to obtain precipitation estimates at ~1 km resolution, land surface variables with both finer resolutions (~1 km) and better accuracy were also needed in 1990s.For example, there were limited NDVI and LST datasets extending to the 1990s.In this study, GIMMS NDVI 3g data with a spatial resolution of 1/12 • , were interpolated to obtain the ~1 km NDVI data, which may result in unavoidable uncertainties in the DS.Besides, the selection of calibration data may also influence the accuracy of the precipitation estimates [45,46].

Possible Applications of the Retrospectively Downscaled Results in Related Fields
Long-term precipitation estimates with both finer spatial resolutions (~1 km) and better accuracy are urgently needed in various related fields, such as soil science, ecology and water balance research [47][48][49][50].The DS obtained in this study could be used in predicting soil properties, like soil organic carbon [47,48], which was abundantly stored in the permafrost in TP and was sensitive to climate change.Although the TP provides water resources supporting enormous ecosystems in China and South-east Asia and >1.4 billion people downstream [49], and different climate change patterns were significantly detected by previous research, long-term precipitation estimates with both finer spatial resolutions and better accuracy are still needed [51], for example, in investigating the variation patterns of precipitation and glaciers.

Conclusions
In this study, a new methodological framework, combining wavelet coherence and Cubist, was proposed to obtain retrospectively the DS from 1990 to 1999 over the TP.Meanwhile, a calibration procedure was also incorporated.The performance of the DS was evaluated using ground observations.The major findings of this study were as follows: (1) the mutation in precipitation patterns was detected in 1984, over the TP, and a significant increasing trend in precipitation was found after 1990; (2) the DS with gauge calibration showed continuous trends over the TP from south-east to north-west; (3) the DS with gauge calibration were in better agreement with ground observations compared with those of CHIRPS data.Therefore, the methodological framework developed in this study could serve as an alternative approach to obtain downscaled results in 1990s retrospectively.

Figure 1 .
Figure 1.Distributions of rain gauges for validation and calibration over the Tibetan Plateau (TP).

Figure 1 .
Figure 1.Distributions of rain gauges for validation and calibration over the Tibetan Plateau (TP).

21 Figure 2 .
Figure 2. Flow chart for obtaining retrospective precipitation estimates at ~1 km, in 1990s, using wavelet coherence and Cubist.

Figure 2 .
Figure 2. Flow chart for obtaining retrospective precipitation estimates at ~1 km, in 1990s, using wavelet coherence and Cubist.

21 Figure 3 .
Figure 3. Mann-Kendall test of mean annual precipitation based on rain gauges over the TP from 1954 to 2013.

Figure 3 .
Figure 3. Mann-Kendall test of mean annual precipitation based on rain gauges over the TP from 1954 to 2013.

Figure 4 .
Figure 4.The correlations and similarities of precipitation patterns in the target year, 1990, and reference years from 2000 to 2013.The X-axis indicated the period along the whole year; the Y-axis indicated the temporal scales; the 95% confidence levels were shown as the thick solid lines; the color bar indicated the strength of correlation.The quantitative evaluations of the correlations of precipitation patterns in the target year, 1990, and reference years from 2000 to 2013, at different temporal scales, were displayed in Table 1.The largest values of MWC (~0.70) and PASC (~40.00%) were determined in results, based on the correlations of precipitation patterns in 1990 and those in 2006, at a long period (>64 days), which coincided with the wavelet coherence results displayed in Figure 4.The strength of the correlations of precipitation patterns in 1990 and 2006 outperformed than those in other reference years, considering the largest mean values of MWC (0.49) and PASC (27.02%).

Figure 4 .
Figure 4.The correlations and similarities of precipitation patterns in the target year, 1990, and reference years from 2000 to 2013.The X-axis indicated the period along the whole year; the Y-axis indicated the temporal scales; the 95% confidence levels were shown as the thick solid lines; the color bar indicated the strength of correlation.The quantitative evaluations of the correlations of precipitation patterns in the target year, 1990, and reference years from 2000 to 2013, at different temporal scales, were displayed in Table 1.The largest values of MWC (~0.70) and PASC (~40.00%) were determined in the results, based on the correlations of precipitation patterns in 1990 and those in 2006, at a long period (>64 days), which coincided with the wavelet coherence results displayed in Figure 4.The strength of the correlations of precipitation patterns in 1990 and 2006 outperformed than those in other reference years, considering the largest mean values of MWC (0.49) and PASC (27.02%).

Figure 5a displayed the
Figure5adisplayed the curves of the precipitation variations in 1990 (blue line) and those in 2006 (red line), based on the mean values of ground observations over the TP.The precipitation patterns in two years showed similar characteristics and trends throughout the entire year, such as the alternations of dry and wet seasons, and the precipitation peaks, and valleys in specific periods.The relative differences in precipitation between 1990 and 2006 were displayed in Figure5b.Generally, the differences were almost randomly distributed, at daily scale, year-round, which indicated the precipitation differences did not have any specific patterns.Therefore, wavelet coherence method could determine the inherent similarities between two precipitation patterns in target years and those in reference years, and reveal strong correlations.

Figure 5 .
Figure 5.The line chart of (a) precipitation variations in 1990 (blue lines) and those variations in 2006 (red lines); and (b) the relative differences of precipitation in 1990 and 2006.

Figure 5 .
Figure 5.The line chart of (a) precipitation variations in 1990 (blue lines) and those variations in 2006 (red lines); and (b) the relative differences of precipitation in 1990 and 2006.

21 Figure 6 .
Figure 6.The correlations and similarities of precipitation patterns in the target years, from 1991 to 1999, and reference years from 2000 to 2013.The X-axis indicated the period along the whole year; the Y-axis indicated the temporal scales in day; the 95% confidence levels were shown as the thick solid lines; the color bar indicated strength of correlation.

Figure 6 .
Figure 6.The correlations and similarities of precipitation patterns in the target years, from 1991 to 1999, and reference years from 2000 to 2013.The X-axis indicated the period along the whole year; the Y-axis indicated the temporal scales in day; the 95% confidence levels were shown as the thick solid lines; the color bar indicated strength of correlation.

Figure 7 .
Figure 7. Spatial patterns of mean annual precipitation over the TP from 1990 to 1999 derived from (a) Climate Hazards group Infrared Precipitation with Stations (CHIRPS) at ~5 km resolution, (b) downscaled precipitation estimates (DS) without rain gauge calibration at ~1 km resolution, (c) DS with rain gauge calibration at ~1 km resolution.
displayed the validations of the DS without rain gauge calibration and those with rain gauge calibration, respectively.Generally speaking, CHIRPS data had the worst performance (R 2 ~ 0.53, bias ~5.58%, NSE ~0.49 and RMSE ~202.84 mm) among the three precipitation products.The DS without gauge calibration outperformed CHIRPS data, with larger correlations and lower errors (R 2 ~0.75, NSE ~0.57, and RMSE ~173.28 mm).The DS with gauge calibration improved the accuracy (R 2 ~0.80, bias ~10.03%,NSE ~0.71 and RMSE ~143.06 mm), compared to those without gauge calibration.

Figure 7 .Figure 8 .Figure 8 .
Figure 7. Spatial patterns of mean annual precipitation over the TP from 1990 to 1999 derived from (a) Climate Hazards group Infrared Precipitation with Stations (CHIRPS) at ~5 km resolution, (b) downscaled precipitation estimates (DS) without rain gauge calibration at ~1 km resolution, (c) DS with rain gauge calibration at ~1 km resolution.Remote Sens. 2018, 10, x FOR PEER REVIEW 14 of 21

Figure 9 .Figure 9 .
Figure 9. Spatial distributions of (a) mean annual precipitation obtained from CHIRPS (left column), DS without gauge calibration (the central column) and DS with gauge calibration (right column), (b) Figure 9. Spatial distributions of (a) mean annual precipitation obtained from CHIRPS (left column), DS without gauge calibration (the central column) and DS with gauge calibration (right column), (b) R 2 , (c) bias, (d) RMSE, (e) NSE, of CHIRPS, DS without gauge calibration and DS with gauge calibration, against ground observations, from 1990 to 1999, over the TP.

Figure 10 .
Figure 10.Mean annual precipitation volumes based on ground observations, and precipitation estimates obtained from CHIRPS and DS, from 1990 to 1999, at specific stations in both regions A and B.

Figure 10 .
Figure 10.Mean annual precipitation volumes based on ground observations, and precipitation estimates obtained from CHIRPS and DS, from 1990 to 1999, at specific stations in both regions A and B.

Table 1 .
Statistical results of wavelet coherence based on correlations and similarities of precipitation patterns in the target year, 1990, and reference years from 2000 to 2013.

Table 2 .
Statistical results of wavelet coherence based on correlations and similarities of precipitation patterns in the target years, from 1991 to 1999, and reference years from 2000 to 2013.

Table 3 .
Statistical results of the CHIRPS data, the DS without and with gauge calibration, against ground observations in the years from 1990 to 1999.