Downscaling Precipitation in the Data-Scarce Inland River Basin of Northwest China Based on Earth System Data Products

Precipitation is a key climatic variable that connects the processes of atmosphere and land surface, and it plays a leading role in the water cycle. However, the vast area of Northwest China, its complex geographical environment, and its scarce observation data make it difficult to deeply understand the temporal and spatial variation of precipitation. This paper establishes a statistical downscaling model to downscale the monthly precipitation in the inland river basin of Northwest China with the Tarim River Basin (TRB) as a typical representation. This method combines polynomial regression and machine learning, and it uses the batch gradient descent (BGD) algorithm to train the regression model. We downscale the monthly precipitation and obtain a dataset from January 2001 to December 2017 with a spatial resolution of 1 km × 1 km. The results show that the downscaling model presents a good performance in precipitation simulation with a high resolution, and it is more effective than ordinary polynomial regression. We also investigate the temporal and spatial variations of precipitation in the TRB based on the downscaling dataset. Analyses illustrate that the annual precipitation in the southern foothills of the Tianshan Mountains and the North Kunlun Mountains showed a significant upward trend during the study periods, while the annual precipitation in the central plains presented a significant downward trend.


Introduction
Precipitation is a crucial part of the water and energy cycle [1,2]. As an important indicator of climate change and hydrological processes, changes in precipitation will inevitably affect the distribution of water resources in the river basins [3,4]. The variations of precipitation and its impacts on ecosystems have gradually become crucial issues in hydrology and ecology, especially in Northwest China [5]. The ecosystems based on the supply of precipitation and meltwater in the arid regions of Northwest China are more vulnerable under global warming [6][7][8]. Northwest China is rich in natural resources with an extremely vulnerable ecological environment due to its limited water resources [6,7]. Under the background of global warming, the climate change in Northwest China has presented a warm-wet trend during the past decades [4,6,9]. The significant increases in precipitation are mainly due to the increases in summer precipitation. Precipitation has a significant effect on regional climate change [2,10,11]. Consequently, it is critical to understand the temporal and spatial variation of precipitation.
However, traditional precipitation data usually come from meteorological stations and rely heavily on observation data [12]. A complex geographical environment, a limited number of meteorological stations, and sparse observation data have made it hard to further investigate the variation of precipitation in Northwest China [2,12]. High mountains and basins are widely distributed in Northwest China, and they form a particular natural unit that combines three significant ecosystems: mountain, oasis, and desert [13]. For the data-scarce inland river basins in Northwest China, with their complicated terrain and sparse meteorological stations, it is convenient to obtain precipitation data from Earth system data products [12]. The Tropical Rainfall Measuring Mission (TRMM), a joint mission of NASA (National Aeronautics and Space Administration) and JAXA (Japan Aerospace Exploration Agency) that was launched in 1997 to study precipitation and climate change, has good research potential [14,15]. TRMM Multi-Satellite Precipitation Analysis (TMPA, 3B43 V.7) is a new dataset that can accurately estimate precipitation distribution. The TMPA data can adequately capture the change of precipitation at the regional scale and correlate well with the rain gauge data in China [16]. Compared with other versions of the dataset, TMPA performs better in the arid regions of Northwest China. Its long-term average annual precipitation produces a reasonable spatial distribution, and its average monthly precipitation trend is reasonable [17][18][19][20][21]. However, when applied to hydrological research in river basins, TMPA data still require downscaling to achieve a high resolution.
Downscaling methods are the usual solution to the problem of low resolution. Conventional methods include linear regression, polynomial regression, and other machine learning methods [22][23][24][25][26]. Previous studies have established the relationships between precipitation, the normalized difference vegetation index (NDVI), and topography information by linear regression or polynomial regression to downscale precipitation [18,25,26,27,28]. However, the relationships between precipitation and topography information are not purely linear, especially in Northwest China [12,25,29]. The downscaling results for the Heihe river basin of Northwest China suggested that the downscaling simulation of quadratic polynomial regression is closer to the observed precipitation [25]. However, the downscaling methods of polynomial regression are based on normal equation methods, such as the least-squares method. The least-squares method is sensitive to noise and suitable for relatively small samples [30,31]. Gradient descent algorithms are often used as the core methods of training algorithms in the field of machine learning, and they are commonly used to iteratively approximate a minimum deviation model, such as regression and artificial neural networks [32][33][34][35][36][37]. The batch gradient descent (BGD) algorithm is a conventional method of gradient descent that is widely used in the field of machine learning [35,38,39,40]. This algorithm moves towards the minimum value of the cost function, and it is relatively stable, less affected by noise, and suitable for large samples [41,42].
In order to investigate the temporal and spatial variation of precipitation in the data-scarce river basin of Northwest China, this paper establishes a statistical downscaling method based on polynomial regression and the BGD algorithm to simulate the spatial variation of monthly precipitation with a high resolution (1 km × 1 km). We selected the Tarim River Basin (TRB) as a typical representation of a data-scarce river basin in Northwest China. Based on the TMPA, NDVI, and DEM (Digital Elevation Model) data, we established a quadratic polynomial regression model of precipitation, DEM, the NDVI, aspect, and slope. We trained the regression model through the BGD algorithm (BGD-based polynomial regression) to downscale the monthly precipitation with a high spatial resolution (1 km × 1 km) from January 2001 to December 2017 and to investigate the change in precipitation. Moreover, we compared the results of this statistical downscaling model and ordinary polynomial regression with the observation data and TMPA, respectively. This study improves the previous downscaling methods based on polynomial regression and enhances the downscaling accuracy, which will be helpful for future research about climate change and downscaling in Northwest China.
Atmosphere 2019, 10, 613

Description of the Study Area
The TRB is the largest inland river basin in Northwest China (Figure 1), with a total area of about 1.02 × 10⁶ km² [13,43]. Because of the extremely arid climate and limited water resources, the main sources of water in the river basin are meltwater from glaciers and snow, together with precipitation [1,44]. The TRB is a closed river basin with unique water cycle processes and large temporal and spatial differences in precipitation distribution [2,45]. It is surrounded by high mountains, including the Tianshan Mountains, the Pamirs, the Karakorum Mountains, the Kunlun Mountains, and the Altun Mountains [46]. The Taklimakan Desert occupies the middle of the basin. This complicated geographical environment produces a closed basin with a typical continental climate [47].

Data Source
This paper constructed the downscaling model of precipitation mainly from Earth system data products (topography, vegetation, and precipitation data), and it verified the downscaling simulation results with observation data. We used the TRMM Multi-Satellite Precipitation Analysis data (TMPA, 3B43 V.7) from January 2001 to December 2018 with a spatial resolution of 0.25° × 0.25° (https://pmm.nasa.gov/TRMM). Its units are mm/hr, which we needed to convert to mm/month. The NDVI data came from the MOD13A3 (Moderate-Resolution Imaging Spectroradiometer, Product ID: MOD13A3) monthly NDVI product with a spatial resolution of 1 km × 1 km, acquired by the NASA Terra satellite. The time range of the NDVI was from January 2001 to December 2018 (https://lpdaac.usgs.gov/). Preprocessing steps such as radiation correction, atmospheric correction, and geometric correction had already been applied by the data provider. To reflect the vegetation changes and process the data more easily, we needed to convert the projection to the World Geodetic System 1984 and mosaic the data from subregions by row number. This research used the Global Bathymetry and Elevation Data at 30 Arc Seconds Resolution of SRTM (Shuttle Radar Topography Mission) DEM (https://topex.ucsd.edu/WWW_html/srtm30_plus.html). The data were developed by the Scripps Institution of Oceanography, University of California, San Diego. Their grid resolution was 30 arc seconds, which is about 1 km. Moreover, the aspect and slope were derived from the DEM. To verify the effectiveness of the predicted precipitation, we also used the observation data of precipitation from January 2001 to December 2012 at ten stations in the TRB. The observation data of precipitation were downloaded from the National Meteorological Information Center (NMIC) (http://data.cma.cn/).
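The conversion from mm/hr to mm/month amounts to multiplying the mean hourly rate by the number of hours in the month. A minimal sketch, assuming the TMPA value is a monthly mean rate; the function name is hypothetical and not taken from the paper:

```python
import calendar


def mm_per_hour_to_mm_per_month(rate_mm_hr, year, month):
    """Convert a monthly mean rain rate (mm/hr) to a monthly total (mm/month)."""
    days = calendar.monthrange(year, month)[1]  # days in the given month
    return rate_mm_hr * 24 * days


# Example: a mean rate of 0.05 mm/hr in February 2004 (leap year, 29 days)
total = mm_per_hour_to_mm_per_month(0.05, 2004, 2)
print(round(total, 1))  # 34.8
```

Using the actual number of days per month, rather than a fixed 30, keeps leap years and long months consistent across the 2001 to 2018 record.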

Methods
Firstly, we fit the quadratic polynomial regression of precipitation, DEM, the NDVI, aspect, and slope based on Earth system data products from January 2001 to December 2017. Secondly, we trained the regression through the BGD algorithm of machine learning. Then, we established a statistical downscaling model to downscale the monthly precipitation in the TRB to a high resolution (1 km × 1 km) from January 2001 to December 2017. Based on the observation data of precipitation, we evaluated the downscaling results by calculating the correlation coefficient, the determination coefficient, the root mean square error, the mean absolute error, and the normalized mean square error [48][49][50][51]. Finally, we used the Mann-Kendall test to examine the variation of precipitation in the TRB during the past 17 years (Figure 2). The quantities used in these methods are defined below in Section 3.3.

The BGD-Based Polynomial Regression
Precipitation alters the growth of vegetation, and vegetation can in turn influence precipitation through its seasonal variation [52,53]. Furthermore, precipitation is affected by terrain, especially in mountainous areas [12]. Bringing vegetation variation and terrain information into the downscaling simulation of monthly precipitation could therefore be useful [54]. In this research, we used the DEM, aspect, and slope to represent the regional terrain information, and we used the NDVI to represent the regional vegetation variation. We constructed a quadratic polynomial regression model based on the DEM, the NDVI, aspect, slope, and TMPA data, and we trained the regression through the BGD algorithm to build the downscaling model. The downscaling model can be expressed as follows:

$$P = F(X, Y, Z, K) + \Delta P$$

where P (mm) is the precipitation, F(X, Y, Z, K) represents the estimated values of the quadratic polynomial regression, X is the DEM, Y is the NDVI, Z is the aspect, and K is the slope. ∆P (mm) is the residual of the precipitation distribution that is not explained by the topography and vegetation factors. The cost function of linear regression is:

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

where h_θ(x^(i)) is the regression estimate for sample i, x^(i) represents the features of the sample, y^(i) is the target value, and m is the number of samples. We trained on the dataset to minimize the cost function, that is, to find the parameter θ that attains min J(θ).
The unconstrained optimization problem is the basis of optimization theory, and an iterative method is usually used to find its optimal solutions [41,55]. Classical numerical optimization algorithms such as the gradient descent and Newton methods can lead to an optimal solution. The gradient descent methods are usually used as the core methods of training algorithms in machine learning to iteratively approximate the minimum deviation model. The BGD algorithm is a gradient descent method which is widely used in machine learning. In each iteration, the BGD algorithm moves in the direction of the steepest decline of the cost function, and it stops when the change in the cost function becomes negligible and the cost stabilizes. It differs from the stochastic gradient descent (SGD) algorithm, which updates the parameters using only one randomly chosen training sample at a time so that the training speed is accelerated. However, a single sample does not represent the trend of the entire sample, and the training speed comes at the expense of accuracy [41]. In this paper, we applied the gradient descent method to minimize the cost function, which requires the partial derivative of J(θ):

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$

The updating function of θ is:

$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$

where h_θ(x^(i)) is the regression estimate for sample i, α is the learning rate, and m is the number of samples. Because each iterative update of θ uses all samples in the dataset, the method is called the BGD algorithm. Its time complexity is O(n). To improve the convergence speed, we could scale the feature samples:

$$x_j := \frac{x_j - \mu_j}{S_j}$$

where µ_j is the average value of x_j and S_j is the standard deviation of the sample.
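The training loop described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the quadratic features omit cross terms, the data are synthetic, and the learning rate of 0.5 and the 2000 iterations are arbitrary choices for the toy problem. All function names are hypothetical.

```python
import numpy as np


def quadratic_features(X):
    """Stack linear and squared terms; cross terms are left out (an assumption)."""
    return np.hstack([X, X ** 2])


def standardize(F):
    """Scale each feature: subtract its mean and divide by its standard deviation."""
    mu = F.mean(axis=0)
    sd = F.std(axis=0)
    return (F - mu) / sd, mu, sd


def batch_gradient_descent(F, y, alpha=0.5, n_iter=2000):
    """Minimize J(theta) = 1/(2m) * sum((A @ theta - y)^2), using all samples per step."""
    m = F.shape[0]
    A = np.hstack([np.ones((m, 1)), F])      # prepend the intercept column
    theta = np.zeros(A.shape[1])
    history = []
    for _ in range(n_iter):
        error = A @ theta - y                 # residual for every sample
        history.append(float(error @ error) / (2 * m))
        theta = theta - alpha * (A.T @ error) / m   # full-batch update
    return theta, history


# Synthetic example: two predictors and a quadratic response (made-up data).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] ** 2

F, mu, sd = standardize(quadratic_features(X))
theta, history = batch_gradient_descent(F, y)
pred = np.hstack([np.ones((len(F), 1)), F]) @ theta
print(history[0], history[-1])
```

Recording the cost at each iteration makes it easy to compare convergence for several learning rates. Without the feature scaling step, a rate this large would typically diverge on raw elevation values.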

Specific Steps of the Downscaling Model
(1) We resampled the DEM, the NDVI, aspect, and slope data to a low resolution (LR) of 0.25° × 0.25°; the high resolution (HR) was 1 km × 1 km.
(2) Then, we fit the polynomial regression equations for the multi-year average monthly data at the low resolution (0.25°) based on the correlations between DEM_LR, NDVI_LR, aspect_LR, slope_LR, and TMPA_LR. The predictor variables were DEM_LR, NDVI_LR, aspect_LR, and slope_LR, and the response variable was TMPA_LR. The regression model was trained by the BGD algorithm to obtain the polynomial regression equation and the regression coefficients C_LR for each of the twelve months.
(3) Based on the dataset (at the low resolution) from January 2001 to December 2017, we calculated the estimated values P_LR of the polynomial regression at the low resolution and subtracted P_LR from the original TMPA data to obtain the residual ∆P_LR. Then, we resampled ∆P_LR by ordinary Kriging interpolation to the high resolution of 1 km × 1 km to get ∆P_HR.
(4) Based on the dataset (at the high resolution) from January 2001 to December 2017, we used DEM_HR, NDVI_HR, aspect_HR, and slope_HR as predictor data in the polynomial regression at the high resolution to obtain the estimated values P_HR. As a result, the predicted precipitation P could be calculated by the equation:

$$P = P_{HR} + \Delta P_{HR}$$
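The four steps above can be sketched on a toy grid. The sketch makes several simplifying assumptions that differ from the paper: it uses a single predictor, it fits the quadratic with `np.linalg.lstsq` instead of BGD to stay short, and it replaces ordinary Kriging with nearest-neighbour replication of the residual (`np.kron`). The function names and the synthetic grids are hypothetical.

```python
import numpy as np


def block_mean(hr, k):
    """Step 1: aggregate an HR grid to LR by averaging k x k blocks."""
    rows, cols = hr.shape
    return hr.reshape(rows // k, k, cols // k, k).mean(axis=(1, 3))


def design(x):
    """Quadratic design matrix [1, x, x^2] for one predictor grid."""
    x = x.ravel()
    return np.column_stack([np.ones_like(x), x, x ** 2])


def downscale(precip_lr, predictor_hr, k):
    predictor_lr = block_mean(predictor_hr, k)
    # Step 2: fit the LR regression (least squares stands in for BGD here).
    coef, *_ = np.linalg.lstsq(design(predictor_lr), precip_lr.ravel(), rcond=None)
    # Step 3: LR residual, then move it to the HR grid.
    fitted_lr = (design(predictor_lr) @ coef).reshape(precip_lr.shape)
    residual_lr = precip_lr - fitted_lr
    residual_hr = np.kron(residual_lr, np.ones((k, k)))  # stand-in for Kriging
    # Step 4: apply the LR coefficients to the HR predictors and add the residual.
    fitted_hr = (design(predictor_hr) @ coef).reshape(predictor_hr.shape)
    return fitted_hr + residual_hr


# Synthetic example: 4 x 4 coarse cells, each covering 5 x 5 fine cells.
rng = np.random.default_rng(0)
elevation_hr = rng.uniform(0.8, 5.0, size=(20, 20))   # km, made-up values
precip_lr = rng.uniform(0.0, 60.0, size=(4, 4))       # mm/month, made-up values
precip_hr = downscale(precip_lr, elevation_hr, k=5)
print(precip_hr.shape)  # (20, 20)
```

Adding the interpolated residual back is what ties the fine-scale field to the coarse product: wherever the predictors are uniform within a coarse cell, the downscaled values reproduce the original coarse value.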

The Evaluation of Downscaling Simulation
Based on the monthly precipitation data of the meteorological stations, we calculated the correlation coefficient (R), the determination coefficient (R²), the root mean square error (RMSE), the mean absolute error (MAE), and the normalized mean square error (NMSE) to evaluate the simulation results [49,50,56,57]. In the definitions of these metrics, y_i is the observed value, ŷ_i is the predicted value, ȳ is the average of the observed values, and N is the number of samples.
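As a rough illustration, the five metrics can be computed as below. RMSE and MAE follow their usual definitions, and R is taken as Pearson's correlation coefficient. Treating R² as the square of R and NMSE as the squared error normalized by the spread of the observations are assumptions, since other conventions exist. The function name is hypothetical.

```python
import math


def evaluate(obs, pred):
    """Return R, R2, RMSE, MAE, and NMSE for paired observed and predicted values."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    sxy = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    sxx = sum((o - mean_obs) ** 2 for o in obs)
    syy = sum((p - mean_pred) ** 2 for p in pred)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    r = sxy / math.sqrt(sxx * syy)
    return {
        "R": r,
        "R2": r ** 2,                 # assumption: squared Pearson R
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(o - p) for o, p in zip(obs, pred)) / n,
        "NMSE": sse / sxx,            # assumption: normalized by observed spread
    }


# Made-up monthly values (mm) for one station.
observed = [2.1, 0.0, 5.4, 12.3, 20.8, 31.0, 25.2, 14.9, 6.3, 3.0, 1.2, 0.4]
downscaled = [1.8, 0.3, 6.0, 11.1, 22.4, 28.7, 26.0, 13.5, 7.1, 2.2, 1.0, 0.9]
print(evaluate(observed, downscaled))
```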

Mann-Kendall Test
The Mann-Kendall test is a nonparametric test which can be used to examine the significance of the trend in climatic and hydrological time series [58,59]. This paper used the Mann-Kendall trend test to explore the trend of precipitation in the TRB of Northwest China. For a time series X_t = (x_1, x_2, ..., x_n), the statistic S of the Mann-Kendall trend test can be calculated as follows:

$$S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sgn}(x_j - x_i), \qquad \operatorname{sgn}(x_j - x_i) = \begin{cases} 1, & x_j - x_i > 0 \\ 0, & x_j - x_i = 0 \\ -1, & x_j - x_i < 0 \end{cases}$$

For a set with a sample size greater than 8, the statistic S is close to a normal distribution with a mean value of 0, and its variance is:

$$\operatorname{Var}(S) = \frac{n(n-1)(2n+5) - \sum_{i} t_i (t_i - 1)(2 t_i + 5)}{18}$$

where t_i is the number of data points in the i-th group of tied values. The calculation formula of the standardized statistic Z_c is:

$$Z_c = \begin{cases} \dfrac{S - 1}{\sqrt{\operatorname{Var}(S)}}, & S > 0 \\ 0, & S = 0 \\ \dfrac{S + 1}{\sqrt{\operatorname{Var}(S)}}, & S < 0 \end{cases}$$

where the Z_c value is the trend statistic.
Positive values indicate that the tested sequence shows an increasing trend, and negative values indicate a decreasing trend; if the absolute value of Z_c is greater than the critical value of 1.64 (the 95% confidence level, meaning that the test will erroneously detect a significant trend 5% of the time), the trend of the sequence is significant. The slope β is commonly used to measure the magnitude and direction of the trend, and its formula is:

$$\beta = \operatorname{Median}\left( \frac{x_i - x_j}{i - j} \right)$$

where 1 ≤ j < i ≤ n, and n is the number of samples. When β > 0, there is an upward trend; otherwise, there is a downward trend. The absolute value of β represents the magnitude of the change.
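The test statistic and the slope estimator can be computed directly from their standard definitions. The following is a minimal sketch using only the Python standard library; the function names and the example series are hypothetical, and the threshold of 1.64 is the one quoted in the text.

```python
import math
from collections import Counter
from statistics import median


def sign(v):
    """Return 1, 0, or -1 according to the sign of v."""
    return (v > 0) - (v < 0)


def mann_kendall(x):
    """Return the statistic S, its variance (with tie correction), and Z_c."""
    n = len(x)
    s = sum(sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    tie_sizes = Counter(x).values()   # groups of size 1 contribute zero
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, var_s, z


def sen_slope(x):
    """Median of all pairwise slopes (x_i - x_j) / (i - j) with j < i."""
    n = len(x)
    return median((x[i] - x[j]) / (i - j)
                  for j in range(n - 1) for i in range(j + 1, n))


# Made-up annual precipitation series (mm) for illustration only.
annual = [98.0, 105.2, 101.7, 110.4, 108.9, 115.3, 112.0, 120.6, 118.1, 125.4]
s, var_s, z = mann_kendall(annual)
print(s, round(z, 3), round(sen_slope(annual), 3), abs(z) > 1.64)
```

Applied cell by cell to the annual totals of a gridded dataset, the same two functions yield maps of trend significance and trend magnitude.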

Downscaling Precipitation
Combining the supervised learning method and the cost function, we trained the polynomial regression through the BGD algorithm to fit the regression equations with optimal parameters for each of the 12 months from 2001 to 2017. We used the TMPA, DEM, NDVI, aspect, and slope at a 0.25° × 0.25° resolution as the training dataset. The testing dataset consisted of the DEM, NDVI, aspect, and slope at the relatively high resolution of 1 km × 1 km. The initial learning rate of the algorithm α was 0.001, and the initial θ was a random value. The training results of the BGD algorithm are mapped in Figure 3. We compared the descent results at different learning rates within a limited number of iterations (Figure 3a), which indicates that the optimal learning rate α was equal to 1. When α = 0.001 (the initial learning rate) and α = 0.01, the cost function hardly converged. When α = 0.1, the convergence was unsatisfactory. When α = 1, the convergence was relatively good. Consequently, we set the learning rate equal to 1. Moreover, we can see that the cost function continuously decreased and eventually stabilized after many iterations (Figure 3b). There were some differences in the relationships between the cost function and the iterations in each month. The cost functions of January, November, and December were relatively high, but they were relatively low in March, April, May, June, and July. This is because the relationship between vegetation and precipitation in spring and summer was more significant than in autumn and winter. Moreover, the increases in precipitation in the Northwest were mainly due to the variation of precipitation in summer and spring [11].
In order to verify the effectiveness of the downscaling model, we used the DEM, NDVI, slope, and aspect to downscale precipitation based on the 2018 dataset. We selected the observation data of four meteorological stations to test the significance of the downscaling results at different learning rates. We can conclude from Figure 4 and Table 1 that the downscaling results were relatively optimal when the learning rate was equal to 1. The results showed that the trend of downscaled precipitation in 2018 was mainly consistent with the observation data (Figure 4). We can see that the four stations had relatively high precipitation in summer. Moreover, the downscaling data presented a good correlation with observation data at the learning rate of 1, with R ranging from 0.8648 to 0.9326, R² from 0.7470 to 0.8651, RMSE from 1.9327 to 3.0488, MAE from 1.4685 to 2.5185, and NMSE from 0.1236 to 0.2319; these values indicate that the downscaling model performed well in the downscaling simulation (Table 1). Consequently, we could continue to simulate precipitation with a high resolution.
According to the validation of this downscaling model, the downscaling effectiveness was acceptable. Then, we used the DEM, NDVI, aspect, slope, and TMPA data from 2001 to 2017 to downscale the monthly precipitation based on BGD-based polynomial regression (the optimal parameters of BGD-based polynomial regression are provided in the Supplementary Materials).

Accuracy Test of Downscaling Results
In order to better evaluate the effect and accuracy of the downscaling model, we compared the 2001 downscaling results of ordinary polynomial regression and BGD-based polynomial regression with the observation data and with TMPA. In addition, we verified the 2001 to 2017 downscaled precipitation of BGD-based polynomial regression by using the observation data.

Comparing the Downscaling Results of Ordinary Polynomial Regression and BGD-Based Polynomial Regression
In order to evaluate the effectiveness of the downscaling model, we downscaled the monthly precipitation in 2001 by using ordinary polynomial regression and BGD-based polynomial regression. Moreover, we also compared the downscaling results of these two methods with observation data and TMPA. At the 0.01 significance level, TMPA had a good correlation with observation data in 2001, with R ranging from 0.7162 to 0.9789 and R² ranging from 0.5129 to 0.9000. Moreover, the downscaling model based on BGD-based polynomial regression performed better in the simulation than that based on ordinary polynomial regression. As shown in Figure 5 and Table 2, the results of the BGD-based polynomial regression varied similarly to the observation data and TMPA, and they showed a higher correlation with the observation data than TMPA and ordinary polynomial regression did, with R ranging from 0.9156 to 0.9871 and R² ranging from 0.8066 to 0.9531. The RMSE, MAE, and NMSE in Table 2 illustrate that the effectiveness of ordinary polynomial regression was relatively inadequate. The low RMSE (0.1904 to 6.4804), MAE (0.1471 to 4.7365), and NMSE (0.0474 to 0.1728) indicate that the effectiveness of BGD-based polynomial regression was acceptable.

The Accuracy Test for Downscaling Precipitation
In order to verify the downscaling results from 2001 to 2017, we extracted the downscaled precipitation at the meteorological stations by the nearest neighbor method according to their locations. Then, we selected a total of 204 months of precipitation data (from January 2001 to December 2017) at each station for the verification of downscaling results, and we fit the scatter plots of calibration and validation between observation data and downscaling data. The calibration used the data of 2018, and the validation used the data from 2001 to 2017. The results of validation showed that there were good linear relationships between the downscaling data and observation data of the meteorological stations in the TRB, with R² ranging from 0.7243 to 0.9147 (Figure 6 and Table 3). The correlation coefficients between the observation data and downscaling data of Kumux, Bayanbulak, Korla, Kalpin, Alaer, Tazhong, Tieganlik, Yarkant, Pishan, and Hotan were 0.8842, 0.9506, 0.9592, 0.9088, 0.9035, 0.8761, 0.9477, 0.8958, 0.8660, and 0.8878, respectively. All these coefficients passed the 0.01 significance test, which illustrates that the downscaling data had a significant positive correlation with the observation data. Additionally, the RMSE, MAE, and NMSE of the downscaling data and observation data of each station revealed that the downscaling model had relatively good performance in the simulation. The validation results showed that the downscaling simulation had great accuracy.
Overall, the downscaling results were close to the observation data (Figure 7). The average, maximum, and minimum values were generally captured well in the downscaling simulation. However, there were still some outliers in the downscaling results (Figure 7a). Based on the comparison of monthly variation between downscaling results and observation data, we could see that these outliers were also present in the observation data, mainly in summer. The presence of outliers in summer reflects the precipitation maximum during those months, which was mainly due to the non-uniform temporal variation of precipitation, with high precipitation in summer and low precipitation in spring and winter [1,60]. Summer precipitation accounted for over 50% of the total precipitation in the TRB, and the maximum precipitation in summer affected the overall annual precipitation in the TRB [2,8,61,62]. The downscaling model captured the precipitation extremes well, which means that the very high and very low values were captured well in the downscaling results.

Comparing the Spatial Distribution of Downscaling Results and TMPA
We input the DEM, NDVI, aspect, and slope data at a 1 km × 1 km resolution from January 2001 to December 2017 into the downscaling model. From the downscaling results, we obtained a monthly precipitation dataset with a high resolution (1 km × 1 km) from January 2001 to December 2017 in the TRB. As shown in Figure 8, we extracted the downscaled precipitation for three representative months (April, August, and December 2017) to compare with the original TMPA data for the corresponding times. The spatial distribution of the downscaled precipitation was consistent with the original TMPA data. The precipitation of the TRB was mainly concentrated in the southern foothills of the Tianshan Mountains, and it was scarce in the central plains [1,2,60].

The Temporal and Spatial Variation of Precipitation
Based on the downscaling results, precipitation was mapped (Figure 9) to analyze its spatial distribution. The precipitation in the TRB from 2001 to 2017 was mainly concentrated in the northern and southwestern mountains, and it was scarce in the central and eastern plains (Figure 6). The spatial distribution characteristics of precipitation in the basin were similar to those of vegetation cover [43,52]. The annual average maximum precipitation was about 411 mm, and the annual average precipitation reached about 110 mm.

The precipitation slope is mapped in Figure 10, which shows that the variation of the precipitation slope at the high resolution was consistent with the change at the low resolution. In the TRB, the areas where annual precipitation presented an increasing trend accounted for 72% of the total basin, and the overall variation of annual precipitation in this basin presented an upward trend. The increasing trend of annual precipitation was significantly high in the Kunlun Mountains and in the surrounding mountains of the South Tianshan Mountains. Additionally, the annual precipitation presented a more significant decline in Lop Nor and the North Altun Mountains than in other areas. The increased amplitude of precipitation in the TRB from 2001 to 2017 decreased from the northwest to the southeast, with heavy precipitation in the northwest and southwest mountains and rare precipitation in the southeast. The results showed that the annual precipitation around the southern foothills of the Tianshan Mountains and the North Kunlun Mountains presented a significant upward trend, while the annual precipitation in Lop Nor and the North Altun Mountains exhibited a significant downward trend (Figure 10).
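A per-pixel precipitation slope and the share of the basin with an increasing trend can be sketched as follows. The sketch assumes that the slope is the ordinary least-squares slope of annual precipitation against the year index, which the text does not state explicitly. The function names and the tiny four-pixel sample are hypothetical.

```python
def linear_trend_slope(annual_values):
    """Least-squares slope of a yearly series against its year index (mm per year)."""
    n = len(annual_values)
    t_mean = (n - 1) / 2.0
    v_mean = sum(annual_values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(annual_values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def fraction_increasing(pixel_series):
    """Share of pixels whose trend slope is strictly positive."""
    slopes = [linear_trend_slope(series) for series in pixel_series]
    return sum(1 for s in slopes if s > 0) / len(slopes)


# Hypothetical annual precipitation (mm) for four pixels over four years.
pixels = [
    [10.0, 12.0, 14.0, 16.0],  # increasing
    [5.0, 4.0, 3.0, 2.0],      # decreasing
    [1.0, 1.0, 2.0, 2.0],      # weakly increasing
    [3.0, 3.0, 3.0, 3.0],      # no trend
]

slopes = [linear_trend_slope(p) for p in pixels]
share = fraction_increasing(pixels)
print(slopes, share)
```

Applied to every 1 km pixel of a 2001 to 2017 stack, the same two functions would give a slope map and an area share comparable in kind to the figures quoted above. Testing the significance of each slope is a separate step not shown here.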

Discussion
This paper established a statistical downscaling model based on BGD-based polynomial regression by simulating the relationship between the NDVI, DEM, aspect, slope, and precipitation. Based on the downscaling model, we downscaled the resolution of monthly precipitation in the TRB and investigated its temporal and spatial variation from 2001 to 2017. The cross-validations showed that the statistical downscaling model based on BGD-based polynomial regression could effectively downscale the monthly precipitation at a high resolution in the data-scarce inland river basin of Northwest China.

The learning rate affects every iteration of the gradient descent algorithm. If the learning rate is too small, the algorithm needs more iterations to find the optimal parameters [42]. If the learning rate is too large, the algorithm can overshoot the minimum of the cost function and fail to converge. The learning rate α is usually chosen in the range 0.001 ≤ α ≤ 10 [41,42]. We found an optimal learning rate of 1 after iterations and cross-validation. The BGD algorithm updates all parameters θ in each iteration, so its convergence was slow and the number of iterations was relatively large. On the other hand, the BGD algorithm moves steadily towards the minimum of the cost function and is relatively stable. In this paper, polynomial regression was transformed into the form of linear regression. The cost function J(θ) of linear regression is a quadratic function with a global optimal solution. Therefore, training a regression model through the BGD algorithm could effectively minimize the cost function and find the optimal parameters [39,41].
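The training procedure discussed above can be illustrated with a minimal sketch of batch gradient descent on a quadratic polynomial rewritten as a linear regression. It assumes a single predictor scaled to [-1, 1], noise-free synthetic targets, and the usual cost J(θ) = (1/2m) Σ (h(x) − y)². The function names and the data are hypothetical and do not reproduce the paper's model. A learning rate of 1 converges for this particular scaling, but that behaviour depends on how the features are scaled.

```python
def quadratic_features(x):
    """Row [1, x, x^2]: the polynomial written as a linear model in theta."""
    return [1.0, x, x * x]


def predict(theta, row):
    """Linear hypothesis h(x) = theta . row."""
    return sum(t * f for t, f in zip(theta, row))


def cost(theta, rows, targets):
    """Quadratic cost J(theta) = 1/(2m) * sum of squared residuals."""
    m = len(rows)
    return sum((predict(theta, r) - y) ** 2 for r, y in zip(rows, targets)) / (2 * m)


def batch_gradient_descent(rows, targets, alpha, n_iter):
    """Update every parameter from the full batch in each iteration."""
    m = len(rows)
    theta = [0.0] * len(rows[0])
    for _ in range(n_iter):
        residuals = [predict(theta, r) - y for r, y in zip(rows, targets)]
        grad = [
            sum(res * r[j] for res, r in zip(residuals, rows)) / m
            for j in range(len(theta))
        ]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Synthetic example: y = 2 + 3x + 0.5x^2 on 21 evenly spaced points in [-1, 1].
xs = [k / 10 for k in range(-10, 11)]
ys = [2.0 + 3.0 * x + 0.5 * x * x for x in xs]
rows = [quadratic_features(x) for x in xs]

theta = batch_gradient_descent(rows, ys, alpha=1.0, n_iter=2000)
print(theta, cost(theta, rows, ys))
```

Because the cost is a convex quadratic, the iterates approach the single global minimum whenever the learning rate is below the stability limit set by the feature scaling. Raising `alpha` well beyond that limit makes the cost grow instead of shrink.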
Quadratic polynomial regression can efficiently capture how precipitation varies with terrain and can therefore simulate the precipitation variation of complex terrain [12,25]. A downscaling model built by fitting a linear regression of the DEM, NDVI, and precipitation can simulate precipitation [27], but the relationships between precipitation, topographic factors, and vegetation are complicated in areas with complex geographical environments [12,25]. We constructed a statistical downscaling model based on BGD-based polynomial regression to simulate monthly precipitation with a relatively high resolution in the TRB. To evaluate the effectiveness of the downscaling model, we compared the results of ordinary polynomial regression and BGD-based polynomial regression. The accuracy assessment of the downscaling results in 2001 indicates that the BGD-based polynomial regression was more effective in the downscaling simulation. The reason for this is that ordinary polynomial regression was based on least-squares regression, and least-squares regression is sensitive to noise [30,31]. The BGD algorithm can be used to solve least-squares problems, as it is less affected by noise and can be applied to large samples [41].
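How a quadratic polynomial in several predictors becomes a linear regression can be sketched by expanding the predictors into degree-2 terms. The sketch assumes that all squares and pairwise products of NDVI, DEM, aspect, and slope are included, which the text does not confirm. The names `PREDICTORS` and `quadratic_terms` and the sample values are hypothetical.

```python
from itertools import combinations_with_replacement

# Predictor names follow the variables named in the text; the order is arbitrary.
PREDICTORS = ("ndvi", "dem", "aspect", "slope")


def quadratic_terms(sample):
    """Return (names, values) of all polynomial terms up to degree 2.

    `sample` maps each predictor name to a (preferably scaled) numeric value.
    """
    names = ["1"]
    values = [1.0]
    # Degree-1 terms.
    for name in PREDICTORS:
        names.append(name)
        values.append(sample[name])
    # Degree-2 terms: squares and pairwise products.
    for a, b in combinations_with_replacement(PREDICTORS, 2):
        names.append(f"{a}*{b}")
        values.append(sample[a] * sample[b])
    return names, values


# Hypothetical scaled predictor values for one grid cell.
cell = {"ndvi": 0.5, "dem": 2.0, "aspect": 1.0, "slope": 3.0}
names, values = quadratic_terms(cell)
terms = dict(zip(names, values))
print(len(names), terms)
```

With four predictors this expansion yields 15 columns (1 constant, 4 linear, 10 quadratic). Each column receives one parameter θ, so the fit remains linear in θ even though it is quadratic in the predictors.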
Moreover, the evaluation of the downscaled precipitation from 2001 to 2017 showed that the precipitation produced by this statistical downscaling model presented good linear relationships and significant correlations with the observation data. The RMSE, MAE, and NMSE between the observation data and the downscaling data were small, which illustrates that the downscaling data were close to the observed data. The NMSE generally shows the most striking differences among models. If a model has a very low NMSE, then it performs well in both space and time [63,64]. Consequently, the NMSE (ranging from 0.0849 to 0.2744; Table 2) of the validation results indicates that the downscaling model performed well in the simulation.
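The three error measures can be sketched as below. RMSE and MAE follow their standard definitions. For NMSE, the sketch assumes the form that divides the mean squared error by the product of the observed and simulated means; the normalization actually used for the tables is not given in this section, so that choice is an assumption, as are the function names and sample values.

```python
import math


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)


def nmse(obs, sim):
    """Normalized mean square error: MSE / (mean(obs) * mean(sim)).

    Assumed normalization; other definitions divide by the variance of obs.
    """
    n = len(obs)
    mse = sum((s - o) ** 2 for o, s in zip(obs, sim)) / n
    return mse / ((sum(obs) / n) * (sum(sim) / n))


# Hypothetical station values (mm); not data from the paper.
observed = [1.0, 2.0, 3.0, 4.0]
downscaled = [1.0, 2.0, 3.0, 6.0]

scores = {
    "RMSE": rmse(observed, downscaled),
    "MAE": mae(observed, downscaled),
    "NMSE": nmse(observed, downscaled),
}
print(scores)
```

RMSE and MAE carry the units of precipitation, whereas NMSE is dimensionless, which is what makes it convenient for comparing stations with very different precipitation totals.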
Based on the downscaling precipitation model, precipitation in the TRB was mainly distributed in the mountains and was scarce in the plains. Precipitation was mostly concentrated in the South Tianshan Mountains and the Kunlun Mountains of this river basin, where there is relatively more vegetation coverage [2,8,60,62]. The central plains had scarce precipitation. The spatial distribution of precipitation was consistent with previous studies [5,65]. The annual precipitation in the TRB from 2001 to 2017 mainly showed an increasing trend, and the precipitation decreased from the northwest mountains to the southeast desert [2,66]. The upward variation of precipitation in the TRB is closely related to global warming and is deeply affected by the warm and humid climate change in the arid regions of Northwest China. This illustrates the response of regional climate change to global warming [10,60].
However, there are still some uncertainties in the downscaling model. We used various earth system products from different satellites. The downscaling model can be affected by atmospheric correction, data accuracy, and surface parameters [67][68][69][70]. Errors in observation instruments and data processing procedures could also influence the model [71]. These uncertainties may be unavoidable with current technologies. To address these issues, we can integrate and merge multiple data sources in later research to improve accuracy and reduce the model uncertainty that arises from data sources [72].
According to the accuracy test of the downscaling model, although the downscaling results had some deviations, their effectiveness was acceptable. We can conclude that the statistical downscaling model based on BGD-based polynomial regression can be used to simulate precipitation at a high resolution, and its downscaling result can be applied to investigate climate change in the inland river basins of Northwest China.

Conclusions
Based on earth system data products, this research established a statistical downscaling model by integrating polynomial regression and the BGD algorithm for the precipitation in the data-scarce inland river basins of Northwest China. We selected the TRB as a typical representation of the inland river basin of Northwest China, and we downscaled the monthly precipitation from January 2001 to December 2017 with a high resolution. The conclusions are as follows:
(1) The downscaling model based on BGD-based polynomial regression can efficiently downscale monthly precipitation with a high resolution in the data-scarce inland river basins of Northwest China. Based on this statistical downscaling model, we can effectively simulate precipitation at a high resolution (1 km × 1 km).
(2) The downscaling model based on BGD-based polynomial regression is more effective in the downscaling simulation than ordinary polynomial regression.
(3) Spatially, the precipitation from 2001 to 2017 was mainly concentrated in the surrounding mountains and was scarce in the central plains. Temporally, the variation of annual precipitation in the TRB presented a significant increasing trend during the study period.
(4) Though we selected the TRB as a research target, the geographical environment conditions of the inland river basins in Northwest China are similar. Therefore, this statistical downscaling model can be used to simulate precipitation in other inland river basins of Northwest China.
Author Contributions: J.Z. designed, carried out the analysis and wrote the manuscript. J.X. revised the paper and refined the results, conclusion, and abstract. Y.C. discussed the results. C.W. edited the figures. All authors approved the manuscript.

Conflicts of Interest: The authors declare that they have no conflict of interest.