1. Introduction
Precipitation is a major component of the global water cycle. It is a key parameter of ecological, hydrological, meteorological and agricultural systems [
1,
2]. It plays an important role in the energy exchange and material circulation of the Earth surface system [
3]. It is of significant importance to understand the characteristics of precipitation, because it shows great variability both in space and time as compared to other climatic variables. Therefore, its spatial and temporal variability greatly influence vegetation distribution, soil moisture and surface runoff [
4,
5]. In addition, a high-quality precipitation dataset is very important in the development of ecological and hydrological models at corresponding scales. However, several limiting factors make it difficult to develop such high-quality datasets from traditional point measurements of precipitation. First, the data derived from point measurements depend heavily on field observations [
5,
6]. Second, field observation stations are not uniformly distributed in space and are limited mostly to low and medium altitude areas, with the exception of a few precipitation stations at high altitudes. Moreover, their operational records are relatively short. Even where longer precipitation records exist from ground-based stations, they are not sufficient to provide coverage for global or regional applications, owing to deficiencies in the reliability of the spatial distribution of precipitation [
7], especially over ocean, desert and mountainous areas. Third, a true spatial coverage of precipitation based on the traditional rain gauge observations cannot be obtained [
8], because many river basins around the world are still poorly gauged [
9], or ungauged [
10]. Fourth, it is difficult to effectively reflect the spatial variability of precipitation based on the observation from a finite number of rainfall stations, especially in areas where rainfall stations are sparsely distributed [
11,
12,
13]. Fifth, rain gauge observations can only reflect the point rainfall within a radius around the instrument location; the effectiveness of such data is often questioned, and adequate validation is needed [
14,
15].
Recently, the development in remote sensing and geographic information technology has given a new dimension to present precipitation observations [
16,
17,
18], almost at the global scale over a long period, which also reflects the spatial patterns and temporal variability of precipitation [
19]. In this regard, various research institutions and government organizations have developed a series of gridded global precipitation datasets, including Earth observations, in situ datasets and models at both regional and global scales, i.e., the Global Precipitation Climatology Project (GPCP) [
2,
20,
21,
22], the Global Satellite Mapping of Precipitation (GSMaP) project [
23], the Multi-Source Weighted-Ensemble Precipitation (MSWEP) [
24], the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) [
25], the Precipitation Estimation from Remotely-Sensed Information using Artificial Neural Networks-Climate Record (PERSIANN-CDR) [
26], the Tropical Rainfall Measuring Mission (TRMM) [
27,
28,
29], the TRMM Multi-satellite Precipitation Analysis (TMPA) [
30], and the Global Precipitation Measurement (GPM) mission [
31,
32,
33].
Spatial downscaling is a recently developed approach to obtain a variable at high spatial resolution based on the relationship between the variable at a coarse scale and geospatial predictor(s) at the low resolution [
34,
35]. In this regard, using spatial downscaling techniques on a precipitation dataset may provide a better representation of the spatial variability of precipitation to be used for different purposes. Several authors have used downscaling methodologies to increase the spatial resolution of satellite-based precipitation, often in combination with Earth observations data available on hydro-meteorological variables related to precipitation, including normalized difference vegetation index (NDVI) [
30,
35,
36,
37,
38,
39,
40,
41,
42], digital elevation model [
30,
38,
43,
44], land surface temperature [
30], soil moisture [
37], in situ rain gauged precipitation [
37,
38], slope [
38], aspect [
38], and wind [
31]. Moreover, few authors have used different satellite-based precipitation datasets for TRMM products [
30,
42]. Additionally, some studies used regression analysis with model parameters spatially constant (multiple linear, polynomial, exponential, regression kriging, etc.), assuming a spatial stationarity of the relationship between precipitation and the predicting variables [
34,
35,
38,
41,
44,
45,
46,
47,
48]. On top of that, some studies limited their analysis only to satellite-based precipitation datasets and did not take full advantage of all available data sources, combining remotely sensed and in situ observations [
42,
49,
50].
In this research work, a new downscaling methodology (
Figure 1), based on the earlier work of [
3,
38,
39], namely the GMWPA, is developed. A DEM (
Figure 2a) is used to derive three geospatial predictors, i.e., elevation, longitude, and latitude [
44], as inputs to the EDBF algorithm. Two satellite-based precipitation datasets are used over the humid (southern) region of Mainland China: the GPM-based multitemporal precipitation data (
Figure 2b–i) for the prediction of high-resolution downscaled weighted precipitation from 0.1° to 0.05° resolution, and the GPM (
Figure 2j,k) and the TRMM (
Figure 2l–o) datasets for the verification of the proposed methodology. During the execution, the following objectives are set [
51]: (1) to evaluate the multitemporal precipitation (2001–2015) dataset through regression analysis, i.e., polynomial regression at different upscaled resolutions, e.g., 0.25°, 0.5°, 0.75°, 1.0°, 1.25°, 1.50°; (2) to run the EDBF algorithm, based on the regression output, to evaluate the multitemporal precipitation at each upscaled resolution by assigning a weight to each temporal component; (3) to verify the output of the EDBF algorithm through the TRMM and the GPM datasets; and (4) to generate the high-resolution downscaled weighted precipitation at 0.05° resolution based on the best performing upscaled resolution. This research can have practical implications, particularly for climate change, drought assessment, and water resources planning, which require long-term precipitation estimates at finer resolution.
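The upscaling step in objective (1) can be illustrated with a minimal sketch. It assumes that upscaling is done by averaging non-overlapping blocks of cells with an integer ratio between the two grids; the text does not state the aggregation method, and a ratio such as 0.1° to 0.25° is not an integer and would need a proper resampling routine. The function name `block_average` and the toy values are hypothetical.

```python
def block_average(grid, factor):
    """Upscale a 2-D grid by averaging non-overlapping factor x factor blocks.

    Cells equal to None are treated as missing and skipped; a block with no
    valid cells yields None. (Assumed aggregation rule, for illustration.)
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r0 in range(0, n_rows, factor):
        row = []
        for c0 in range(0, n_cols, factor):
            # Collect the valid cells of the current block.
            block = [
                grid[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
                if grid[r][c] is not None
            ]
            row.append(sum(block) / len(block) if block else None)
        coarse.append(row)
    return coarse


# Hypothetical 4 x 4 precipitation field (mm) on a fine grid.
fine = [
    [1000.0, 1200.0, 800.0, 900.0],
    [1100.0, 1300.0, 700.0, None],
    [1500.0, 1500.0, 600.0, 600.0],
    [1500.0, 1500.0, 600.0, 600.0],
]
coarse = block_average(fine, 2)  # each coarse cell covers 2 x 2 fine cells
```

In this toy case the missing cell is simply left out of its block mean, which is one possible convention among several.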
4. Discussion
In this study, a new downscaling methodology, namely GMWPA at 0.05° resolution, was developed and investigated in the humid region of Mainland China. A two-stepped procedure [
38,
39,
41], based on a scale-dependent regression analysis and downscaling of the predicted multitemporal weighted precipitation at a refined scale, was adopted during the execution of the proposed methodology. For this purpose, the multitemporal GPM precipitation dataset (2001 to 2015) at 0.1° and ASTER 30 m DEM-based geospatial predictors, i.e., elevation, longitude, and latitude, were taken as input variables to predict both the low-resolution weighted precipitation (for residual generation at the optimal resolution scale) and the high-resolution weighted precipitation, which were then used in the final downscaling process.
Furthermore, the regression analysis was performed in two phases. In the first phase, each geospatial predictor was assessed by developing a relationship (
Table 1) with each individual precipitation variable via a polynomial fitting line. Latitude showed the highest correlation with all precipitation variables and achieved the highest R² value. Compared to previous studies [
3,
34,
59] which used either one or two independent variables (NDVI, elevation), the authors in [
38] used several independent variables, i.e., latitude, longitude, elevation, slope, aspect, NDVI, Max_NDVI, Range_NDVI, and Min_NDVI, to establish regression models for deriving the annual precipitation over continental China. That study concluded that, apart from latitude, all variables including NDVI showed relatively weak empirical relationships with the observed precipitation, especially over the humid region of China. For NDVI specifically, a possible reason is that NDVI-related predictors are better indicators of precipitation in arid and semi-arid areas. In humid areas, NDVI saturates and does not increase with increased rainfall, which weakens the empirical relationship between precipitation and NDVI. In view of this, latitude was selected as the proxy of precipitation and used to assign an initial weight (e.g., based on the r value calculated for each precipitation variable with respect to latitude) to each individual precipitation variable from the multitemporal precipitation dataset, which was then processed in the EDBF algorithm [
58] to predict the weighted precipitation.
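The initial weighting described above can be sketched as follows. This is only an illustration of r-based weights, not the EDBF algorithm itself: it assumes that the initial weight of each precipitation variable is proportional to the absolute value of its Pearson r with latitude and that the weights are normalized to sum to one. The function names and sample values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def initial_weights(latitude, variables):
    """Assign each variable a weight proportional to |r| with latitude.

    Assumption: |r| is used so that a strong negative correlation counts
    as much as a strong positive one; weights are normalized to sum to 1.
    """
    strength = {name: abs(pearson_r(latitude, values))
                for name, values in variables.items()}
    total = sum(strength.values())
    return {name: s / total for name, s in strength.items()}


# Hypothetical pixel values: latitude (degrees N) and precipitation (mm).
latitude = [22.0, 24.0, 26.0, 28.0, 30.0]
variables = {
    "annual": [1800.0, 1600.0, 1400.0, 1200.0, 1000.0],  # strongly related
    "summer": [700.0, 500.0, 650.0, 450.0, 600.0],       # weakly related
}
weights = initial_weights(latitude, variables)
```

With these toy values the annual variable receives the larger initial weight, in line with the principle that a higher r value gives a variable more influence.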
Likewise, in the second phase, the output precipitation variable from EDBF, i.e., the weighted precipitation, was assessed by developing its relationship with latitude through linear fitting. The correlation between latitude and the weighted precipitation increased for each of the low-resolution scales, and the highest R² was achieved at 100 km (e.g., among the 0.75°, 1.0° and 1.25° resolutions), which showed that the weighted precipitation was well captured by latitude at 100 km resolution. Although the highest correlation between latitude and the weighted precipitation was achieved at 1.0° (100 km), the 0.75° resolution was selected as the optimal low resolution (e.g., for the upscaling) during the downscaling process, for two reasons. First, there was little difference in R² between the two resolutions, i.e., 0.75° (R² = 0.7918) and 1.0° (R² = 0.7977). Second, the 0.75° resolution had more pixels covering the whole study area, i.e., 195, compared to 111 pixels for the 1.0° resolution. To convert points into pixels, the Spline Interpolation method [
51,
60] was used, which estimates values using a mathematical function that minimizes the overall surface curvature, resulting in a smooth surface that passes exactly through the sample points and is fitted using a specified number of nearest input points. Thus, the 0.75° resolution, whose specified number of nearest input points (i.e., 12 points) lie closer together than at the 1.0° resolution, tends to produce a smoother surface by minimizing the surface curvature.
EDBF is a general framework rather than a specific algorithm; it is easy to implement and can accommodate any existing multi-parent crossover algorithm (MCA). Moreover, the existing MCA-based coefficients [
61,
62,
63] follow a uniform distribution, which also violates constraints and thus propagates error. Errors cascade exponentially with even a slight increase in the hybrid scale, which increases time consumption. EDBF addresses this problem by taking multiple MCAs as its constituent members. In addition, the number of iterations during the execution of the EDBF algorithm at the low-resolution scales, i.e., 0.25°, 0.50°, 0.75°, 1.0°, 1.25° and 1.50°, was set to 3 × 10⁴ so that enough iterations would be available for convergence to stabilize before the end of the simulation. The process was repeated for all the low resolutions. Although the convergence stabilized before 3 × 10⁴ iterations, a slight improvement could still be observed, and further improvement in the regression value(s) could be expected. Therefore, instead of terminating the simulation early, we let it run until the last iteration. Owing to that, the number of iterations was reduced during the simulation of the high-resolution (i.e., 0.05°) weighted precipitation, and the convergence was well stabilized within the set number of iterations.
During the verification process, the weighted precipitation was first compared with its contributing multitemporal precipitation variables at all the low-resolution scales and at the original resolution. In terms of R², it outperformed all input variables; in terms of RMSE, it outperformed the annual precipitation but underperformed the seasonal and the monthly precipitation variables. Furthermore, the weighted precipitation was compared with different classified precipitation variables, extracted either individually or as groups from the original multitemporal precipitation dataset used in the prediction of the EDBF-based weighted precipitation at the original 0.1° resolution. The results are shown in
Table 5, in which the weighted precipitation showed the highest correlation with its predictor (R² = 0.772) compared to the other variables used. In addition, the weighted precipitation had a lower RMSE (141.113 mm) than the Avg-An (01–15) + Wet Ppt + Dry Ppt, Avg-An (01–15) + Dry Ppt, Avg-An (01–15) + Wet Ppt, Wet Ppt + Dry Ppt, Avg-An (01–15) and Avg-MT (−01 & −04) Ppt variables, whose RMSE values were 179.248, 206.353, 182.762, 178.025, 192.537 and 197.434 mm, respectively. However, it had a higher RMSE than the Avg-MT Ppt variable (135.370 mm). The low RMSE of the average multitemporal GPM precipitation arose because each precipitation variable in the multitemporal dataset contributed equally to the average output. Of the eight variables used from the multitemporal precipitation dataset, five consisted of the seasonal and the monthly precipitation, which had lower precipitation received per pixel. In addition, the number of days in each seasonal component (on average 90 days) is lower than in the annual component (365 days), so the seasonal precipitation is less likely to vary than the annual precipitation. Despite lower R² values, less variability from the mean precipitation was observed in the seasonal and the monthly precipitation than in the annual precipitation. In contrast, the EDBF-based weighted precipitation was predicted mainly on the basis of weights assigned via the calculated r values: the higher the r value, the more weight was assigned to that variable and the more that variable contributed to the prediction of the weighted precipitation. Additionally, the weighted precipitation was compared with neutral variables, and it outperformed all of them in terms of R² and RMSE.
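For readers who wish to reproduce this kind of comparison, the two verification metrics can be sketched as below. The sketch assumes that R² is the squared Pearson correlation between the two pixel series (which equals the coefficient of determination of a simple linear fit) and that RMSE is computed pixel by pixel; the sample values are hypothetical and are not taken from Table 5.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences (same units)."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(predicted, observed):
    """Squared Pearson correlation, i.e., R² of a simple linear fit."""
    n = len(observed)
    mean_p, mean_o = sum(predicted) / n, sum(observed) / n
    spo = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    spp = sum((p - mean_p) ** 2 for p in predicted)
    soo = sum((o - mean_o) ** 2 for o in observed)
    return spo ** 2 / (spp * soo)


# Hypothetical pixel values (mm): reference data and a candidate prediction.
observed = [1000.0, 1200.0, 1400.0, 1600.0]
predicted = [1010.0, 1190.0, 1410.0, 1590.0]

error_mm = rmse(predicted, observed)
fit = r_squared(predicted, observed)
```

The two metrics answer different questions: R² measures how well the spatial pattern is reproduced, whereas RMSE measures the magnitude of the pixel-wise deviation, which is why a variable can rank well on one and poorly on the other.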
The downscaling methodology applied in this study was mainly based on the work presented in [
39], where the basis function was selected at an optimum resolution and by interpolating the residuals. After successfully applying the proposed methodology, the EDBF algorithm was employed in downscaling of the dry year (2001), the wet year (2004) and the average annual (2001–2015) precipitation at 0.05° resolution by following the same process as for downscaling the
precipitation. Before downscaling, a graphical relationship between the weighted precipitation and the dry year (2001), the wet year (2004) and the average annual (2001–2015) precipitation was developed through a scatter plot as shown in
Figure 5g–i, respectively. The weighted precipitation showed the highest correlation with the dry year (2001), followed by the average annual (2001–2015) and the wet year (2004), with R² = 0.9869, 0.8929 and 0.4154, respectively.
Moreover, during downscaling, the low-resolution weighted residuals (
Figure S6d–f) were generated by subtracting the low-resolution weighted precipitation
(
Figure 7b) from the original dry year (2001), the wet year (2004) and the average annual (2001–2015) precipitation (
Figure S6a–c) at 0.75° resolution, respectively. Afterward, the high-resolution weighted residuals (
Figure S6g–i) at 0.05° were obtained by interpolating the low-resolution residuals at 0.75° resolution. Finally, by adding the obtained high-resolution interpolated residuals to the high-resolution weighted precipitation (
Figure 7e), the downscaled high-resolution weighted precipitation at 0.05° resolution for the dry year (2001) (
Figure 8d), the wet year (2004) (
Figure 8e) and the average annual (2001–2015) precipitation (
Figure 8f) was obtained.
Figure 8 shows that the high-resolution weighted precipitation captured the same precipitation pattern as the original GPM dry year (2001), wet year (2004) and average annual (2001–2015) precipitation at 0.1°. Moreover, the class-wise pattern (
Table 6) of the obtained precipitation shows that the algorithm accurately captured the wet year (2004) (
Figure 8e) and the average annual (2001–2015) precipitation, whereas some classes, e.g., class 4 (gold) and class 5 (light green), were not well captured during downscaling of the dry year (2001) precipitation, for example between 111° and 115°E and 25° and 27°N, and between 117° and 118°E and 24° and 25°N.
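The residual correction described in this paragraph can be summarized in a minimal sketch. It assumes node-registered grids with an integer refinement factor (the 0.75° to 0.05° step corresponds to a factor of 15), and it uses bilinear interpolation as a simple stand-in for the Spline Interpolation method used in the study. All function names and values are hypothetical.

```python
def bilinear_refine(coarse, factor):
    """Refine a node-registered grid by an integer factor (bilinear)."""
    m, n = len(coarse), len(coarse[0])
    fine = []
    for i in range((m - 1) * factor + 1):
        r, step_r = divmod(i, factor)
        r_next = min(r + 1, m - 1)
        ty = step_r / factor
        row = []
        for j in range((n - 1) * factor + 1):
            c, step_c = divmod(j, factor)
            c_next = min(c + 1, n - 1)
            tx = step_c / factor
            # Interpolate along the two bounding rows, then between them.
            top = coarse[r][c] * (1 - tx) + coarse[r][c_next] * tx
            bottom = coarse[r_next][c] * (1 - tx) + coarse[r_next][c_next] * tx
            row.append(top * (1 - ty) + bottom * ty)
        fine.append(row)
    return fine


def downscale(original_coarse, predicted_coarse, predicted_fine, factor):
    """Add interpolated low-resolution residuals to the fine prediction."""
    # Step 1: low-resolution residual = original - predicted.
    residual_coarse = [
        [o - p for o, p in zip(row_o, row_p)]
        for row_o, row_p in zip(original_coarse, predicted_coarse)
    ]
    # Step 2: interpolate the residuals to the high resolution.
    residual_fine = bilinear_refine(residual_coarse, factor)
    # Step 3: add the residuals to the high-resolution prediction.
    return [
        [p + e for p, e in zip(row_p, row_e)]
        for row_p, row_e in zip(predicted_fine, residual_fine)
    ]


# Hypothetical 2 x 2 coarse fields (mm) and a factor of 2 for brevity.
original_coarse = [[1000.0, 1200.0], [1400.0, 1600.0]]
predicted_coarse = [[980.0, 1210.0], [1390.0, 1620.0]]
predicted_fine = bilinear_refine(predicted_coarse, 2)  # toy fine prediction
downscaled = downscale(original_coarse, predicted_coarse, predicted_fine, 2)
```

At the coarse nodes the corrected field reproduces the original values exactly, which is the purpose of the residual step; between nodes the result depends on the interpolation method chosen.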
Subsequently, the difference in the range of precipitation classes (i.e., the difference between the upper and the lower boundary of the captured precipitation pattern) was analyzed between the original dry year (2001), wet year (2004) and average annual (2001–2015) precipitation at 0.1° resolution and their corresponding weighted precipitation at 0.05° resolution. The two were found to be in close agreement, with an average difference of less than 5 mm for most classes. Apart from that, the EDBF algorithm slightly underpredicted extreme precipitation for the dry year (2001) and the wet year (2004), with an average difference of 30 mm, and overpredicted the average annual (2001–2015) precipitation, with a difference of 20 mm. In contrast, for low precipitation, EDBF underpredicted the dry year (2001) and the average annual (2001–2015) precipitation and overpredicted the wet year (2004) precipitation, with average differences of 10, 4 and 23 mm, respectively. Similarly, considering the individual precipitation variables, EDBF accurately predicted the wet year (2004) and the average annual (2001–2015) precipitation, with an average difference of less than 5 mm, whereas it slightly overpredicted the dry year (2001), with an average difference of 10 mm between the original and the corresponding weighted precipitation.