Evaluation of Global Precipitation Products over Wabi Shebelle River Basin, Ethiopia

Abstract: This study evaluates three global precipitation products and their downscaled versions (CHIRPSv2, TAMSATv3, PERSIANN_CDR, CHIRPS_D, TAMSAT_D, and PERSIANN_CDR_D) against observed values from 1983 to 2014. Performance evaluation of global precipitation products and their downscaled versions is important for the accurate use of these estimates in water resource management, climate, and hydrological applications, particularly in the data-sparse Wabi Shebelle River Basin, Ethiopia. Categorical and quantitative evaluation index techniques were applied. The spatially downscaled global precipitation products outperformed the raw-resolution estimates in all statistical indicators. At a daily timestep, TAMSAT_D had acceptable performance ratings in terms of RMSE, CC, and scatter plots (R²), whereas CHIRPSv2 showed the poorest performance. The performance of global precipitation products and their downscaled versions increased when daily data were aggregated to monthly data. CHIRPS_D performed better than other products, with the minimum error (RMSE) and a higher CC at a monthly timestep. On the other hand, PERSIANN_CDR_D showed relatively good performance, with a lower, positive Pbias and higher POD values than other products at daily and monthly timescales. In the spatial mismatch analysis, the bias and RMSE of the satellite rainfall estimate (PERSIANN_CDR) differed significantly depending on the reference data (an individual rain gauge station vs. the average of all eight available stations), which could be related to how well the position of a gauge station represents the “true” spatial rainfall amount. Overall, TAMSATv3 and CHIRPSv2 and their downscaled versions showed good performance at daily and monthly timesteps, respectively. PERSIANN_CDR performed best in terms of Pbias and POD, with a low Pbias and the highest POD values. Thus, this study concluded that the downscaled CHIRPSv2 and PERSIANN_CDR_D satellite estimates could serve as alternatives to gauge data at a monthly timestep for hydrological and drought-monitoring applications, respectively.


Introduction
Rainfall is an essential primary input for the hydrologic cycle, as well as for hydro-meteorological modeling [1][2][3][4]. On the other hand, rainfall data are constrained by poor networks and uneven distribution because of the insufficient budget for the operation and installation of rain gauge networks in most parts of the developing world. [...] Because global precipitation products (GPPs) exist at a coarser resolution (larger pixel size) than required by climate studies and hydrological applications, they have to be downscaled to a fine resolution to match the sampling of GPPs with gauge data [44].
A comprehensive evaluation of global precipitation products and their downscaled versions, particularly with a spatial mismatch at different timescales, is needed for a better understanding of watershed hydrology; however, this has not been performed for the Wabi Shebelle River Basin to the best of our knowledge. Uncertainty related to the grid-to-point method can be addressed by avoiding the spatial mismatch between the global precipitation product and corresponding station measurement by downscaling the coarse resolution to a fine resolution.
Therefore, this study attempted (i) a comprehensive evaluation of native and downscaled global precipitation products against ground reference rainfall data, and (ii) a quantification of the uncertainty associated with a grid-to-point approach for the spatial scale of global precipitation products at a selected pixel scale.

Study Area Description
The Wabi Shebelle River Basin (WSRB) is one of the largest basins in Ethiopia, located in the southeastern part of the country. The river originates in the Arsi and Bale Mountain ranges at 4000 m above sea level and drains to the Indian Ocean after crossing Somalia. The basin lies between latitudes 4°45′-9°45′ N and longitudes 38°45′-45°30′ E. The WSRB is characterized by a bimodal rainfall regime due to the southward and northward movement of the intertropical convergence zone (ITCZ), with rainy seasons from March to May and from July to September. According to the master plan hydrology report, the highest recorded mean annual rainfall is 1467 mm in Seru Wereda of the Arsi Zone, and the lowest is 220 mm in the Kelafo area of the Somali Region [23]. In general, rainfall is unevenly distributed in space and time; it is concentrated in the upper and urban areas of the basin and tends to decrease with decreasing altitude, as shown in Figure 1.

Rain Gauge Data
There are about 74 meteorological stations within and around the basin; they are unevenly distributed and clustered in the upper and urban areas. The rainfall dataset for the WSRB was obtained from the National Meteorological Agency (NMA) and covers the period 1983 to 2014. Long-term meteorological data for the WSRB are more complete in the upstream parts of the basin, and these stations were used to analyze precipitation in the area, as shown in Figure 2.
Rain gauge stations for this study were carefully chosen following a quality control process for climate data (verification of the in situ stations' geographical coordinates, checking for false zeros, checking for the presence of outliers, and homogeneity testing) using the Climate Data Tool (CDT) https://github.com/rijaf-iri/CDT (accessed on 27 June 2020). Twenty-seven of the 74 gauging stations, those with more than 80% available (non-missing) and continuous data, were selected for the comparison of different GPPs in the study area.
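The completeness criterion used to select stations can be illustrated with a minimal sketch. It assumes each station record is a list of daily values in which `None` marks a missing observation; the station names and values are hypothetical, and the continuity check and the other CDT quality control steps are not reproduced here.

```python
def completeness(series):
    """Fraction of non-missing daily values; None marks a missing record."""
    if not series:
        return 0.0
    return sum(v is not None for v in series) / len(series)


def select_stations(records, min_fraction=0.80):
    """Return the names of stations whose available-data fraction exceeds the threshold."""
    return sorted(
        name for name, series in records.items()
        if completeness(series) > min_fraction
    )


# Hypothetical ten-day records (mm/day), for illustration only.
records = {
    "station_a": [0.0, 2.5, None, 1.0, 0.0, 0.0, 4.2, 0.0, 0.0, 3.1],     # 9 of 10 available
    "station_b": [None, None, 0.0, None, 1.2, 0.0, None, 0.0, 0.0, 0.0],  # 6 of 10 available
}
selected = select_stations(records)
```

Here only `station_a` passes the 80% threshold.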

Global Precipitation Products
Global precipitation data with fine spatial and temporal resolution provide an alternative source of homogeneous timeseries information for data-scarce areas, going back as far in time as possible (30+ years) for hydrological applications and climate studies [45]. Global precipitation data are a combined product of reanalysis, rain gauge data, and remote sensing estimates.
For this study, three global precipitation products and their downscaled versions, with different temporal and spatial scales, were taken as inputs (Table 1). The selection of the GPPs was based on public availability, ease of estimation, global coverage, multiyear period, and previous record of estimate performance.
The Climate Hazards Group Infrared Precipitation with Station data version two (hereinafter CHIRPSv2) was developed by the United States Geological Survey (USGS) and the University of California, Santa Barbara (UCSB); it blends satellite estimates, global climatology, and gauge observations from the Global Telecommunication System (GTS). The CHIRPSv2 dataset combines 0.05° spatial resolution estimates with ground reference measurements to generate a daily sequence of data points covering 50° S-50° N since 1981 [12].
The Tropical Applications of Meteorology using SATellite version three (hereinafter TAMSATv3) estimate, developed by the University of Reading in the UK, provides estimates on a daily timescale from Meteosat thermal infrared (TIR) fine-resolution observations, employing calibrated cold cloud duration (CCD) data for Africa and disaggregating pentadal totals. The TAMSATv3 estimate has a 0.0375° spatial resolution and uses ground rainfall measurements to generate timeseries for all of Africa from January 1983 to date [11].
The Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks Climate Data Record (PERSIANN-CDR, hereinafter PERSIANN_CDR) system was developed by the Center for Hydrometeorology and Remote Sensing (CHRS) at the University of California, Irvine (UCI); it uses a neural network function classification procedure to estimate the precipitation amount for each 0.25° × 0.25° grid cell from infrared (IR) temperature imagery provided by geostationary satellites. The rainfall product covers 60° S-60° N globally from 1983 to 2015 [13].

Methodology
This study evaluated the performance of three global precipitation products and their downscaled versions (CHIRPSv2, TAMSATv3, PERSIANN_CDR, CHIRPS_D, TAMSAT_D, and PERSIANN_CDR_D) at different spatial and temporal scales against 27 ground gauge stations from 1983 to 2014. Categorical and quantitative evaluation index techniques were applied to the WSRB, Ethiopia.

Grid-to-Point Approach
There are two typical approaches for evaluating global precipitation products, i.e., the grid-to-grid and grid-to-point methods. The first method requires the interpolation of gauge data to a grid, whereby the gridded gauge data are compared with the gridded global precipitation estimates; however, converting points to interpolated grids introduces an error resulting from the interpolation of an uneven geospatial distribution [46][47][48][49][50]. The second approach involves a direct comparison of station rainfall data with the pixel in which each gauge is located [23,34,41,42]. In an area such as the Wabi Shebelle River Basin, with a sparse and unevenly distributed gauge network, the grid-to-point approach is the first choice for assessing the GPPs independently; it treats the gauges as representative measurements of the corresponding GPP pixels, without considering the location of the station within the pixel.
Because global precipitation products exist at a coarser resolution (larger pixel size) than required by climate studies and hydrological applications, they have to be downscaled to a fine spatial resolution of 1 km for evaluation against point gauge rainfall in the desired application. The spatial downscaling method and the satellite rainfall estimate are the two most critical aspects in determining the accuracy of downscaled findings. In the Upper Tekezie River Basin, bilinear downscaling performed marginally better than the nearest-neighbor method for integrating satellite products with observed rainfall [51]. Other studies also preferred the bilinear downscaling method because it yields smoothly interpolated satellite-derived rainfall [52,53]. Therefore, bilinear downscaling was chosen to downscale the spatial resolution of pixels for this study area.
The downscaled global precipitation product is more accurate than the original coarser-resolution product [54,55]. Therefore, the pixel values of the GPPs at their raw spatial resolution and in their downscaled versions (0.01° × 0.01°) were compared to gauge measurements.
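Bilinear downscaling amounts to evaluating a coarse rainfall grid at finer locations, such as a gauge position or a 0.01° sub-pixel centre, by weighting the four surrounding pixel centres. The following is a minimal sketch of that interpolation step only, not the processing chain used in the study; the function name, coordinates, and rainfall values are hypothetical.

```python
def bilinear(grid, lats, lons, lat, lon):
    """Bilinearly interpolate grid[i][j] (value at lats[i], lons[j]) at (lat, lon).

    lats and lons are ascending pixel-centre coordinates with at least two entries each.
    """
    def lower_index(axis, x):
        # Index k of the lower neighbour, so that axis[k] <= x <= axis[k + 1].
        if not axis[0] <= x <= axis[-1]:
            raise ValueError("point lies outside the grid of pixel centres")
        k = 0
        while k < len(axis) - 2 and axis[k + 1] <= x:
            k += 1
        return k

    i, j = lower_index(lats, lat), lower_index(lons, lon)
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])  # fractional position in latitude
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])  # fractional position in longitude
    south = grid[i][j] * (1 - tx) + grid[i][j + 1] * tx
    north = grid[i + 1][j] * (1 - tx) + grid[i + 1][j + 1] * tx
    return south * (1 - ty) + north * ty


# Hypothetical 2 x 2 block of 0.25-degree pixel centres with daily rainfall in mm.
lats = [8.0, 8.25]
lons = [40.0, 40.25]
grid = [[2.0, 4.0],
        [6.0, 8.0]]
value_at_gauge = bilinear(grid, lats, lons, 8.125, 40.125)
```

A point midway between the four centres receives their mean, and a point that coincides with a pixel centre keeps that pixel's value.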
The grid-to-point method can introduce uncertainty into the performance assessment of global satellite precipitation products because two datasets on different spatial scales are compared regardless of the location of the gauges within the pixel. One PERSIANN_CDR (0.25° × 0.25°) pixel contains multiple rain gauge stations (more than three), which allows the spatial mismatch between the global precipitation product and station observations to be investigated for the eastern upper course (blue-colored grid box in the study area map, Figure 1).

Evaluation Performance Indices
The quantitative and categorical evaluation indicators were selected according to robustness, common usage, and recommendation in previous studies [39]. These performance indicators are described at https://www.cawcr.gov.au/projects/verification/ (accessed on 12 May 2017) and implemented within the Climate Data Tool (CDT). Performance was assessed through the following quantitative evaluation indicators: the coefficient of determination (R²) (Equation (1)), Pearson's correlation coefficient (CC) (Equation (2)), percentage bias (Pbias) (Equation (3)), bias (Equation (4)), and root-mean-square error (RMSE) (Equation (5)). CC measures the strength of the linear relationship between two variables; its values range from −1 (perfect negative correlation) through zero (no correlation) to one (perfect positive correlation). R² measures how well the independent variable explains the dependent variable in a regression; its values range between zero (no correlation) and one (perfect correlation). Bias describes the extent to which the observed value is underestimated or overestimated. The RMSE represents how closely the satellite estimate matches the measured value.
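For reference, the standard textbook forms of these indices, assumed here to correspond to Equations (1)-(5), can be written as follows. The number of paired values n, the sign convention of Pbias (positive when the product exceeds the gauge total), and the multiplicative form of the bias are assumptions; an additive mean-error form of the bias is also in common use.

```latex
\begin{align}
R^2 &= \left[ \frac{\sum_{i=1}^{n} (G_i - \bar{G})(S_i - \bar{S})}
      {\sqrt{\sum_{i=1}^{n} (G_i - \bar{G})^2}\,\sqrt{\sum_{i=1}^{n} (S_i - \bar{S})^2}} \right]^2 \tag{1} \\
CC &= \frac{\sum_{i=1}^{n} (G_i - \bar{G})(S_i - \bar{S})}
      {\sqrt{\sum_{i=1}^{n} (G_i - \bar{G})^2}\,\sqrt{\sum_{i=1}^{n} (S_i - \bar{S})^2}} \tag{2} \\
Pbias &= 100 \times \frac{\sum_{i=1}^{n} (S_i - G_i)}{\sum_{i=1}^{n} G_i} \tag{3} \\
Bias &= \frac{\sum_{i=1}^{n} S_i}{\sum_{i=1}^{n} G_i} \tag{4} \\
RMSE &= \sqrt{\frac{1}{n} \sum_{i=1}^{n} (S_i - G_i)^2} \tag{5}
\end{align}
```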
where G_i and S_i represent the gauge and global precipitation data on the i-th day, i is the index, and S̄ and Ḡ are the average values of S_i and G_i, respectively. The ability of the global precipitation estimates to detect the occurrence of precipitation was tested using the probability of detection (POD) (Equation (6)). POD evaluates the likelihood that an observed precipitation event is correctly detected by the satellite estimate. A dichotomous estimate that says "yes, an event will happen" or "no, an event will not happen" was used to quantify the metric, as shown in Table 2. For this application, a rainfall threshold of 1 mm was applied to distinguish rainy from non-rainy days [25,33].
where the POD score ranges from 0 to 1, with 1 indicating perfect detection.
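The indices described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the CDT implementation: the standard definitions are used, Pbias is taken as positive when the product exceeds the gauge total, a day counts as rainy when rainfall is at least 1 mm, and POD is computed as hits divided by hits plus misses. The sample values are hypothetical.

```python
import math


def pearson_cc(gauge, sat):
    """Pearson's correlation coefficient between paired gauge and satellite values."""
    g_mean = sum(gauge) / len(gauge)
    s_mean = sum(sat) / len(sat)
    cov = sum((g - g_mean) * (s - s_mean) for g, s in zip(gauge, sat))
    g_ss = sum((g - g_mean) ** 2 for g in gauge)
    s_ss = sum((s - s_mean) ** 2 for s in sat)
    return cov / math.sqrt(g_ss * s_ss)


def r_squared(gauge, sat):
    """Coefficient of determination of a simple linear regression (CC squared)."""
    return pearson_cc(gauge, sat) ** 2


def pbias(gauge, sat):
    """Percentage bias; positive when the product exceeds the gauge total (assumed sign)."""
    return 100.0 * sum(s - g for g, s in zip(gauge, sat)) / sum(gauge)


def rmse(gauge, sat):
    """Root-mean-square error between product and gauge values."""
    return math.sqrt(sum((s - g) ** 2 for g, s in zip(gauge, sat)) / len(gauge))


def pod(gauge, sat, threshold=1.0):
    """Probability of detection: hits / (hits + misses), rainy day = value >= threshold."""
    hits = sum(1 for g, s in zip(gauge, sat) if g >= threshold and s >= threshold)
    misses = sum(1 for g, s in zip(gauge, sat) if g >= threshold and s < threshold)
    return hits / (hits + misses)  # undefined when the gauge records no rainy day


# Hypothetical six-day sample (mm/day), for illustration only.
gauge = [0.0, 5.0, 12.0, 0.0, 3.0, 0.5]
sat = [0.0, 4.0, 9.0, 2.0, 0.0, 1.5]
```

In this sample the gauge records three rainy days, of which the product detects two, so `pod(gauge, sat)` evaluates to 2/3.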

Comparison of Global Precipitation Products at Temporal Scale
This section compares three global precipitation products and their downscaled versions with station measurements, taking gauge representativeness into account, to identify the most reliable products for water resource assessment, climate studies, and hydrological applications across the data-scarce WSRB at different temporal scales for the period from 1983 to 2014.

Daily Comparison
The raw and downscaled global precipitation data were evaluated against observed rainfall at a daily timescale. Global precipitation products and their downscaled versions showed weak performance according to the majority of statistical indicators. The downscaled GPPs outperformed the original coarser-resolution products, as can be seen in Table 3. This result is similar to previous findings [54,55] and might be due to the accuracy of the original precipitation product and the spatial downscaling method [39]. The RMSE of the global precipitation products and their downscaled versions was highest in the southern and northeastern parts of the basin, with values ranging from 4 to 13 mm, as can be seen in Figure 3b. TAMSAT_D performed better than the other products, with the minimum RMSE (6.926 mm). Pearson's correlation coefficient (CC) showed a poor relationship for all global precipitation products, but the CC was relatively higher in the southern and northern parts of the basin, with values between 0.05 and 0.5, as can be seen in Figure 3a. TAMSAT_D showed the best agreement, with the highest CC (0.332). The highest coefficient of determination (R² = 0.039) was obtained by TAMSATv3 and TAMSAT_D, as can be seen in Figure 5a. The relatively high performance of daily rainfall estimates from TAMSATv3 and its downscaled version could be due to the loss of localized convective precipitation with the specified threshold value of the study area. This finding is in line with previous investigations. CHIRPSv2 and its downscaled version showed the worst performance, as can be seen in Table 3. This could be attributed to the areal discrepancy between gauge observations and satellite estimates, as well as to the retrieval algorithms used in disaggregating pentadal data to daily values [56]. On the other hand, PERSIANN_CDR_D showed a relatively good performance with a lower, positive Pbias compared to other products (underestimate), with a value of 3.09%, as presented in Table 3.
The spatial distribution of Pbias for PERSIANN_CDR_D (Figure 3c) showed better performance at most stations. The ability of the GPPs to detect the occurrence of precipitation events was also evaluated. In general, the downscaled products had better rainfall detection capability than the raw-resolution products in terms of the POD categorical statistical indicator. In this context, PERSIANN_CDR_D revealed a higher POD (0.691) than the PERSIANN_CDR precipitation product, as presented in Table 3. Both the raw and the downscaled precipitation products provided reasonably good PODs, varying between 0.25 and 0.893, as shown in Figure 4a. The highest POD and low Pbias indicate that PERSIANN_CDR_D is suitable for capturing the behavior of extreme precipitation events in the Wabi Shebelle River Basin, Ethiopia. The same result was also confirmed by [57]. This could be due to the adjustment of PERSIANN_CDR using the GPCP monthly 2.5° precipitation product [13]. CHIRPSv2 showed extremely poor performance according to the categorical statistical indicator values.
It can be observed that, in general, the downscaled and raw products presented poor agreement with the ground reference data (r < 0.5). The scatter plots and cumulative distribution functions (CDFs) of the average daily timeseries of gauge rainfall against the GPPs were examined (Figure 5a,b). A relatively high coefficient (R² = 0.039) was obtained by TAMSATv3 and TAMSAT_D, whereas CHIRPSv2 scored the lowest value. PERSIANN_CDR (downscaled and raw) showed reasonable agreement with the ground reference (R² = 0.03). Furthermore, all products were comparatively symmetric about the 45° line. According to the CDFs (Figure 5b), none of the products was comparatively denser along the 45° line. Furthermore, TAMSAT_D and PERSIANN_CDR_D revealed the worst correspondence with the station CDFs. This shows that these products underestimated the distribution for rainfall ≤ 10 mm/day, whereas CHIRPSv2 and CHIRPS_D overestimated the distribution for rainfall ≤ 10 mm/day.
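The CDF comparison can be illustrated by evaluating the empirical cumulative distribution of gauge and product rainfall at a chosen amount, such as 10 mm/day. The sketch below assumes the usual empirical CDF definition (the fraction of days with rainfall at or below the amount); the values are hypothetical.

```python
def ecdf(values, x):
    """Empirical CDF: fraction of values that are less than or equal to x."""
    return sum(v <= x for v in values) / len(values)


# Hypothetical daily rainfall samples (mm/day), for illustration only.
gauge = [0, 0, 2, 5, 8, 12, 20, 0, 1, 15]
product = [0, 1, 3, 6, 11, 14, 25, 2, 4, 18]

gauge_cdf_10 = ecdf(gauge, 10)
product_cdf_10 = ecdf(product, 10)
```

In this sample the product places a smaller share of days at or below 10 mm/day than the gauge does (0.6 against 0.7).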

Monthly Comparison
The accuracy of the global precipitation products in replicating precipitation was further investigated at a monthly timescale, as shown in Figure 6 and Table 4. The results indicate that the performance of the GPPs and their downscaled versions increased when daily data were aggregated to monthly data. These findings are consistent with [10,58], which evaluated the accuracy of global precipitation products aggregated to a coarser temporal resolution. For example, one study [36] investigated several global precipitation products over Burkina Faso at different temporal resolutions. The results indicated that the categorical and volumetric indicators improved significantly upon aggregating the timescale. Similarly, the authors of [59] evaluated the CHIRPS satellite precipitation estimates over eastern parts of the continent. In the comparison of CHIRPS estimates with ARC2 and TAMSAT, the CHIRPS estimates agreed reasonably better with the reference data at decadal and monthly timescales, with a better skill of detection and lower bias, while TAMSAT performed better at a daily timescale.
The downscaled GPPs outperformed their original coarser-resolution counterparts according to all statistical indicators of accuracy. CHIRPS_D performed better than other products, with the minimum error (RMSE = 53.734 mm) and a higher correlation (CC = 0.748). Pearson's correlation coefficient (CC) showed a good relationship for the raw and downscaled global precipitation products. Scatter plots of the average monthly timeseries of gauge rainfall against the three GPPs and their downscaled versions were generated. The highest coefficient (R² = 0.418) was obtained by CHIRPS_D. As the temporal aggregation increased from days to months, the rainfall amount estimated by CHIRPSv2 became increasingly accurate. The best performance of CHIRPSv2 and its downscaled version could be due to the elimination of error as the data were aggregated to a coarser timescale. These findings are consistent with earlier investigations of CHIRPSv2 rainfall data at a monthly timescale [59,60].
PERSIANN_CDR showed the lowest values for RMSE, CC, and R², as can be seen from Table 4 and Figure 6, on the monthly timescale. PERSIANN_CDR_D showed relatively good performance with a lower, positive Pbias compared to other products (overestimate), with a value of 1.999%. PERSIANN_CDR_D resulted in the highest POD value of 0.993. CHIRPS_D and PERSIANN_CDR had the second-best probability of detection (POD), whereas the TAMSAT group had the lowest value (Table 4). This implies that the performance of satellite estimates was influenced by the algorithm and data source used.
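Aggregating daily rainfall to monthly totals before recomputing the indices is the step behind the monthly comparison. A minimal sketch, assuming the daily series is a list of (date, rainfall in mm) pairs without missing values, with hypothetical data:

```python
from collections import defaultdict
from datetime import date


def monthly_totals(daily):
    """Sum (date, rainfall_mm) pairs into a {(year, month): total_mm} mapping."""
    totals = defaultdict(float)
    for day, rainfall_mm in daily:
        totals[(day.year, day.month)] += rainfall_mm
    return dict(totals)


# Hypothetical daily records, for illustration only.
daily = [
    (date(1983, 1, 1), 2.0),
    (date(1983, 1, 15), 3.5),
    (date(1983, 2, 1), 1.0),
]
totals = monthly_totals(daily)
```

The same aggregation would be applied to the gauge series and to each product so that the monthly values are compared like for like.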

Uncertainty Associated with a Pixel-to-Point Method
In addition to the spatiotemporal investigation, the effect of the position of the stations within a pixel on the evaluation of the global precipitation product was analyzed, as shown in Figure 1 (blue-colored grid box). A pixel of the selected GPP (PERSIANN_CDR) was compared against reference data using the spatial average of all existing station data versus individual gauge stations within the pixel. The minimum RMSE for PERSIANN_CDR was obtained when the spatial average, rather than each individual gauge station in the blue-colored box, was used as the reference, with an average value of 4.667, as presented in Table 5. PERSIANN_CDR had a reasonable maximum bias (overestimated by 12.6%) against the spatial average at the pixel level. On the other hand, the maximum biases were 40% and 31% when using the individual gauge stations Bisidimo and Fedis, respectively. In the comparison between the spatial average and the individual stations, Deder exhibited the smallest bias, while the other stations changed the direction of the bias, with the exception of the Grawa and Bedeno gauge stations.
Generally, in terms of bias and RMSE, spatial averages estimated using rainfall data (eight stations) exhibited considerably different values to the referenced individual rain gauges in terms of magnitude. This magnitude difference may be related to the positions of the gauge stations and the uncertainty due to the representativeness of an individual rain gauge in providing the "true" spatial rainfall amount. Furthermore, the authors of [17,20,22,61] examined the variability and gauge representativeness of rainfall retrieved from the global precipitation product and showed the effect of network density on performance assessment. Therefore, it is essential to apply appropriate representative gauge data for the evaluation of products. Uncertainty related to the grid-to-point method can be addressed by avoiding the spatial mismatch between global precipitation products and the corresponding station measurements by downscaling the coarse resolution to a fine resolution [44]. In addition, installing additional rain gauges is strongly recommended within the grid [25].
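The spatial mismatch comparison can be sketched as follows: the pixel series is scored once against the day-by-day average of all gauges in the pixel and once against each gauge individually. The gauge names and values below are hypothetical and are not the stations of the study.

```python
import math


def rmse(reference, estimate):
    """Root-mean-square error of the estimate against the reference series."""
    n = len(reference)
    return math.sqrt(sum((e - r) ** 2 for r, e in zip(reference, estimate)) / n)


def station_average(stations):
    """Day-by-day mean rainfall across all stations located in the pixel."""
    return [sum(day) / len(day) for day in zip(*stations.values())]


# Hypothetical four-day series (mm/day) for one pixel and two gauges inside it.
pixel = [4.0, 0.0, 6.0, 2.0]
stations = {
    "gauge_1": [2.0, 0.0, 8.0, 2.0],
    "gauge_2": [6.0, 0.0, 2.0, 2.0],
}

average = station_average(stations)
rmse_vs_average = rmse(average, pixel)
rmse_vs_each = {name: rmse(series, pixel) for name, series in stations.items()}
```

In this example the pixel agrees more closely with the station average than with either gauge alone, which is the pattern the analysis above reports for PERSIANN_CDR.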

Conclusions
In the current study, a total of six GPPs, three from raw global precipitation products (CHIRPSv2, TAMSATv3, and PERSIANN_CDR) and three from downscaled global precipitation products (CHIRPS_D, TAMSAT_D, and PERSIANN_CDR_D), were used. A bilinear method was applied to downscale the coarse spatial resolution of GPPs to 1 km resolution pixels. Categorical and quantitative evaluation index techniques were applied to WSRB, Ethiopia. The primary objective of the study was to assess the performance of the global precipitation products and their downscaled versions at different temporal scales compared to ground gauge stations.
The results indicated that the performance of global precipitation products is affected by factors such as the gauge density, spatiotemporal scale, and type of satellite algorithm. At the daily timescale, the products performed poorly at the majority of gauge stations. According to the evaluation parameters at the daily timescale, the downscaled GPPs performed best in terms of all statistical indicators. The evaluation clearly indicated that TAMSAT_D was the best performer in terms of RMSE, CC, and scatter plots (R²). On the other hand, PERSIANN_CDR_D showed a relatively good performance with a lower, positive Pbias and higher POD values compared to other products. CHIRPSv2 showed the worst performance at a daily timescale. The performance of the GPPs and their downscaled versions increased when daily data were aggregated to monthly data. At the monthly timescale, CHIRPS_D performed better than other products, with the minimum error (RMSE) and higher CC and R². However, PERSIANN_CDR_D presented a low Pbias and the highest POD values on daily and monthly timescales. In the spatial mismatch analysis, the bias and RMSE estimated using rainfall data from individual rain gauges differed in magnitude from those estimated using the spatial average for PERSIANN_CDR, indicating that individual gauge data alone could not accurately evaluate the product.
Overall, the performance of downscaled global precipitation products was better than that of the coarser-resolution products according to all statistical parameters. The TAMSAT_D and CHIRPS_D products were the best-performing GPPs in reproducing the daily and monthly rainfall data, respectively. PERSIANN_CDR also accurately captured the extreme rainfall over the study area. This study provides a relatively long, consistent, and homogeneous timeseries rainfall dataset for climatology analysis and hydrological applications with a 1 km resolution for the study area. Although satellite precipitation products provide information at a high spatial resolution, they are lower in precision. On the other hand, gauges provide accurate point measurements but have limited spatial representativity. Therefore, for future studies, we recommend merging the downscaled products with gauge data to improve data availability in terms of accuracy, spatial distribution, and accumulated rainfall volume over the data-scarce Wabi Shebelle River Basin, Ethiopia, with its complex terrain, as well as over other regions with a similar climate and topography.
Data Availability Statement: All data, models, and code generated or used during the study appear in the submitted article.