Evaluation of GPM-Era Satellite Precipitation Products on the Southern Slopes of the Central Himalayas Against Rain Gauge Data

The Global Precipitation Measurement (GPM) mission provides high-resolution precipitation estimates globally. However, their accuracy needs to be assessed for algorithm enhancement and hydro-meteorological applications. This study uses data from 388 gauges in Nepal to evaluate the spatial-temporal patterns presented in recently developed GPM-era satellite-based precipitation (SBP) products, i.e., the Integrated Multi-satellite Retrievals for GPM (IMERG), both satellite-only (IMERG-UC) and gauge-calibrated (IMERG-C), and the Global Satellite Mapping of Precipitation (GSMaP), both satellite-only (GSMaP-MVK) and gauge-calibrated (GSMaP-Gauge). The main results are as follows: (1) GSMaP-Gauge best represents the observed spatial distribution of precipitation, followed by IMERG-UC, GSMaP-MVK, and IMERG-C. (2) The gauge-calibrated datasets are more consistent (in terms of relative root mean square error (RRMSE) and correlation coefficient (R)) than the satellite-only datasets in representing the seasonal dynamic range of precipitation. However, all four datasets can reproduce the seasonal cycle of precipitation, which is predominantly governed by the monsoon system. (3) Although all four SBP products underestimate the monsoonal precipitation, the gauge-calibrated IMERG-C yields a smaller mean bias than GSMaP-Gauge, while GSMaP-Gauge shows the smaller RRMSE and higher R-value, indicating that IMERG-C estimates the precipitation amount more reliably than GSMaP-Gauge, whereas GSMaP-Gauge presents a more reasonable spatial distribution than IMERG-C. Only IMERG-C moderately reproduces the evident elevation-dependent pattern of precipitation revealed by gauge observations, i.e., gradually increasing with elevation up to 2000 m and then decreasing, while GSMaP-Gauge represents the gauge-observed spatial pattern much better than the others. (4) GSMaP-Gauge, which is calibrated against the daily gauge analysis, is the most consistent of the four datasets in detecting gauge-observed precipitation events. The high-intensity precipitation extremes (95th percentile) are more intense in regions with an elevation below 2500 m; all four SBP datasets have low accuracy (<30%) and mostly underestimate (by >40%) the frequency of extreme events at most of the stations across the country. This work quantifies the performance of the new-generation SBP products on the southern slopes of the central Himalayas in Nepal. It compares both satellite-only (IMERG-UC and GSMaP-MVK) and gauge-calibrated (IMERG-C and GSMaP-Gauge) products for their accuracy and discrepancies against measurements from 388 gauges from March 2014 to December 2016. Conventional statistical metrics and categorical scores were used to quantify the performances of these SBP products.


Introduction
Precipitation is a vital component of the water cycle, and understanding its characteristics is essential for hydro-meteorological applications [1,2]. In mountainous regions, water resource management is more challenging because of the complex climate associated with topographic variance [3]. In these regions, the occurrence of hydrological hazards such as floods, landslides and soil erosion is very sensitive to precipitation amounts. Thus, reliable and precise estimates of precipitation are a prerequisite for hydro-meteorological and natural disaster studies [4,5].
Nepal lies on the south-central part of the main Himalayan range, with more than 80% of the country covered by mountains; in this environment, there is a high probability of landslides and debris flows during the monsoon season. Precipitation in the country is extremely variable due to the complex topography. The seasonal cycle is predominantly governed by the monsoon system [6,7] with maximum (~80%) precipitation occurring in summer. Rain gauge-based measurements provide relatively accurate measurements of precipitation on the ground surface [8,9]. These observations developed by the Department of Hydrology and Meteorology (hereafter, DHM) in Nepal are relatively dense in the lowlands but sparse in high mountain areas [10,11]. The scarcity of rain gauge observations is a major challenge in hydro-meteorological studies and for effective water and disaster management. This scarcity of measurements also limits knowledge of precipitation patterns across the country [12]. Fortunately, high-resolution satellite-based precipitation (hereafter, SBP) products provide potential alternatives for monitoring precipitation on regular high-resolution grids, yielding unprecedented levels of detail especially over remote areas and mountainous regions where stations are very sparse. However, these estimates are indirect measurements and must be verified and calibrated using gauge observations before further application [13,14].
SBP estimates are based on various remotely sensed characteristics of clouds, such as cloud-top temperature (IR imagery), reflectivity (visible) or from the scattering effects of ice particles on passive microwave (PMW) radiation [15][16][17][18]. In the post-Tropical Rainfall Measuring Mission (TRMM) era, the Global Precipitation Measurement (GPM) Core Observatory spacecraft, equipped with advanced sensors and channels, like the Dual-frequency Precipitation Radar (DPR) and the GPM Microwave Imager (GMI) which had capabilities to sense light rain and snowfall, was launched on 27 February 2014 in a collaboration between NASA and the Japan Aerospace Exploration Agency (JAXA) [16]. New SBP products were introduced after the GPM mission: the Integrated Multi-satellite Retrievals for GPM (IMERG) [19,20]; meanwhile, JAXA updated to a newer version of the Global Satellite Mapping of Precipitation (GSMaP) product (GSMaP Version 07) with orographic rainfall correction [21].
Besides several global studies, only a few studies have evaluated SBP products in a topographically challenging region like Nepal. For example, the TRMM precipitation product shows a negative bias (underestimation) compared to gauge observations over the Himalayan region of the country [41]. Similarly, Islam et al. [12] found comparable results for 15 stations across the country. In contrast, Duncan and Biggs [42] indicated that the TRMM product generally overestimated (positive bias) precipitation compared to a gauge-based gridded product (the Asian Precipitation Highly Resolved Observational Data Integration Towards Evaluation: APHRODITE) over Nepal. In mountainous regions, the TRMM (3B43) precipitation product shows reasonable skill, while the GSMaP, the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Network (PERSIANN), and the Climate Prediction Center Morphing Method (CMORPH) products showed considerably weaker performance in reproducing gauge-observed precipitation amounts [6]. A study in a high-elevation area (Khumbu Himalayas) of Nepal compared seasonal and diurnal variations of precipitation in the TRMM (3B42), PERSIANN, CMORPH, and GSMaP products using hourly gauge-observed precipitation [43]. It found that GSMaP performed poorly, while TRMM, PERSIANN and CMORPH agreed well with rain-gauge data. Recently, Derin et al. [44] evaluated the GPM-era SBP products over different complex terrain areas, including ten stations from Nepal. They found that GSMaP-V07, following the introduction of the orographic rainfall classification in the GSMaP algorithm, measured orographic precipitation and precipitation amount better than IMERG-V06B. The authors also noted that IMERG-V05B captured light and heavy precipitation amounts better than IMERG-V06B for the evaluated regions.
Most past studies in Nepal were based on previous-generation satellite products and showed that errors in SBP estimates were partially related to the rugged topography, as the algorithms could not adequately detect orographically induced precipitation. Additionally, the local climate and the nature of the topography are among the dominant factors that characterize the uncertainty of SBP products [35,45,46,47]. However, a systematic evaluation of the new-generation SBP products, and an intercomparison between these products, has not yet been performed at the national scale. Thus, in this study, we aimed to comprehensively evaluate four precipitation datasets from the two SBP products, i.e., GPM-era IMERG (V06B) and GSMaP (V07), against 388 gauge observations with respect to their spatial and seasonal accuracy over Nepal. Their performances are analyzed for tendencies and discrepancies across different elevation ranges and relative intensities on daily and monthly timescales from March 2014 to December 2016. Moreover, the accuracy of these SBP products in representing the east-to-west variation of the monsoon and extreme wet events is also analyzed. The results of this study will provide critical scientific references for choosing the appropriate product for future scientific research.

Study Area
Nepal is located on the southern slopes of the central Himalayas at 26.36°-30.45° N latitude and 80.06°-88.2° E longitude (Figure 1). Approximately 80% of the country comprises hills and mountains, and the remaining 20% is flatlands. The elevation of the country rises abruptly from the southern lowlands to the higher Himalayas, giving rise to complex topography, weather and climate. Physiographically, the country is broadly classified into Terai (lowlands), hills and mountains [42,48]. The south Asian monsoon system and westerlies regulate the climatology of the country, with maximum precipitation in the summer season (80%, June-September) followed by the spring (March-May), autumn (October-November), and winter (December-February) seasons [49]. Usually, the monsoon advances from the east of the country in early June and brings rainfall to the whole country within ten days [50]. During the winter, under the westerly-controlled climate in the western region, the country only receives about 3% of its annual precipitation [51]. Pre- and post-monsoon seasons are generally dry and hot, while the winter is cold and precipitation is generally in the form of snow, especially in high-elevation mountain areas [52].

Rain Gauge Data
The daily precipitation dataset from 387 stations, covering March 2014 to December 2016, was obtained from DHM Nepal (https://www.dhm.gov.np/contents/resources). The DHM stations report daily totals at 03 UTC. For consistent measurement, DHM uses the same type of United States standard eight-inch diameter manual rain gauge [53]. In addition to the DHM datasets, data from Pyramid, a high-elevation Automatic Weather Station (AWS) located in the Everest region (27.95° N, 86.20° E, 5050 masl), were also used. In total, data from 388 stations were used for the evaluation of SBP products (Figure 1) and were further subjected to quality control. The remote location, unavailability of AWS, and lack of regular monitoring and maintenance of rain-gauge stations are the primary causes of discontinuities in the data series. Data coverage (%) at each station between March 2014 and December 2016 is presented in the Supplementary Materials (S1). The observations from 125 gauges were used in the development of the Global Precipitation Climatology Centre (GPCC) product, which was, in turn, used to calibrate IMERG precipitation totals [55]. About 54 stations also belonged to the Global Telecommunications System (GTS) and were used for the National Oceanic and Atmospheric Administration (NOAA)/CPC analysis, which was in turn used to calibrate the daily GSMaP-Gauge product [56]. Therefore, the gauge observations and the gauge-corrected SBP products used in this study are potentially not independent. The mean precipitation (mm/day) at each rain-gauge station during the study period is presented in Figure 3a.

Satellite Datasets
IMERG is NASA's level-3 multi-satellite GPM product. Since the launch of the GPM mission, four versions (V03, V04, V05, and V06) of the IMERG algorithm have been developed. IMERG combines retrievals from PMW and microwave-calibrated infrared (IR) sensors to produce a high-resolution global SBP product. Compared to earlier versions (V03 and V04), IMERG V05 and V06 inter-calibrate all individual PMW satellite estimates from the Goddard PROFiling (GPROF2017) scheme [57]. Until V05B, the Lagrangian time interpolation scheme was computed from IR data, while V06 adopted a new model-based morphing scheme in which motion vectors are retrieved from Modern-Era Retrospective Reanalysis 2 (MERRA-2) and Goddard Earth Observing System Model (GEOS) Forward Processing (FP) data using total column water vapor (TQV) [58]. For the first time, IMERG V06 used Precipitation Retrieval and Profiling Scheme (PRPS) estimates from the Sounder for Atmospheric Profiling of Humidity in the Intertropics by Radiometry (SAPHIR), with the PMW estimates intercalibrated to the combined Ku-swath DPR/GMI product (CORRA) [20]. These intercalibrated estimates were then merged into a single PMW SBP estimate and used to calibrate the IR-based precipitation. IMERG uses a Kalman filter-based method to combine the observed PMW, propagated PMW, and IR estimates into a single, best estimate. IMERG provides three different products, the Early, Late and Final runs, with latencies of ~4 h, ~14 h, and ~3.5 months, respectively. Only forward propagation of the cloud motion vectors is applied in the Early run, and backward morphing is added in the Late run. Meanwhile, in the Final run, climatological calibration coefficients were added on the basis of the Late run [33]. Rain-gauge data are not assimilated in the Early and Late runs, while the Final product is adjusted using the 1° GPCC gauge analysis, which is interpolated to 0.1° and applied equally to every half hour in the month. The Final run is generally recommended for research purposes.
IMERG applies the wind-loss correction scheme [59] to the GPCC gauge analyses during the calibration process. The latest IMERG version 06 products were released to the public in March 2019. The Final version of the IMERG product includes two precipitation fields, precipitationUncal and precipitationCal. The Early and Late precipitation products are identical to the Final precipitationUncal since the gauge correction is only applied to the precipitationCal field in the Final product. IMERG V06B introduced full intercalibration to the GPM combined instrument datasets (2BCMB); in this version, the upper limit on input precipitation rates was also raised from 50 to 200 mm/h to adjust fractional coverage [44]. For this study, data from precipitationUncal (hereafter IMERG-UC) and precipitationCal (hereafter IMERG-C) at 0.1° spatial resolution from the Final run of IMERG version 06B between March 2014 and December 2016 were obtained from the PMM website (https://gpm.nasa.gov/data-access/downloads/gpm). A detailed description of the IMERG algorithm can be found in Huffman et al. [60].
GSMaP is an SBP product developed by Japan Science and Technology (JST) under the Core Research for Evolutional Science and Technology (CREST) program [61,62].
To provide high-precision precipitation products, GSMaP combines various available PMW and IR sensors [63]. In the development of the GSMaP precipitation products, the instantaneous precipitation rate is first retrieved from the PMW radiometers on various satellite platforms, such as GMI, the advanced microwave sounding unit-A (AMSU-A), the advanced microwave scanning radiometer-2 (AMSR-2), the TRMM Microwave Imager (TMI), the microwave humidity sounder (MHS), and the special sensor microwave imager/sounder (SSMIS). The gaps between the PMW-based estimates are then filled by propagating them with atmospheric moving vectors (cloud motion vectors) calculated from successive IR images. In addition, a new Kalman filter model is applied to refine precipitation rates after the propagation [61]. Finally, forward and backward propagated precipitation estimates are weighted and combined to produce the GSMaP-MVK product. GSMaP-MVK also uses IR to correct satellite estimates but adopts different PMW imagers and sounders. To develop the GSMaP-Gauge precipitation product, the GSMaP-MVK estimates are further adjusted using the daily 0.5° NOAA/CPC gauge-based analysis; this reduces precipitation biases, and the product has a latency of 3 days [64]. In addition to the GPM/DPR database, an orographic rain correction classification was also introduced in the GSMaP-V07 algorithm. In the current study, the Version 07 satellite-only (GSMaP-MVK) and gauge-adjusted (GSMaP-Gauge) hourly datasets with a 0.1° spatial resolution were used. An overview of the selected datasets in the current study is provided in Table 1.

Quality Control
All four SBP products are gridded datasets, and the gauge observed datasets are at the point scale, i.e., fixed at a single location on the ground surface. Therefore, a point-to-pixel comparison was performed to compare the point-based gauge observed data with the gridded precipitation datasets [65][66][67][68]. We extracted SBP estimates for the station locations instead of interpolating the gauge observations to avoid accumulating additional errors by gridding the observed data [69][70][71]. These SBP rates were first aggregated to obtain daily timescale records using DHM daily precipitation measurement time windows (03 UTC). Some of the station data feature missing values, and quality control was conducted for data consistency; if the gauge-observed daily data contained missing values, then the corresponding daily SBP data were simultaneously considered to be a missing value. The monthly data were computed when the station had more than 25 days of precipitation data available in a month; otherwise, the precipitation in that month was considered as a missing value. Similarly, the monthly SBP data were also considered a missing value for consistency if the corresponding monthly data were missing from the gauge observed datasets.
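The matching steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the processing code used in the study: all function names are hypothetical, the grid is assumed to be a regular 0.1° grid addressed from its southern and western edges, and a daily total is assumed to cover the 24 h ending at 03 UTC and to be labelled with the date on which that window ends.

```python
import math
from collections import defaultdict
from datetime import date, datetime, timedelta


def containing_cell(lat, lon, lat0, lon0, res=0.1):
    """Return (row, col) of the grid cell that contains a station.

    lat0 and lon0 are the southern and western grid edges (assumed layout).
    """
    return math.floor((lat - lat0) / res), math.floor((lon - lon0) / res)


def daily_totals_03utc(hourly):
    """Sum hourly rates (mm/h) into 24 h totals that end at 03 UTC.

    `hourly` maps the UTC start time of each hour to a rate. Adding 21 h
    moves the 03 UTC boundary onto midnight, so the hours starting at 03:00
    on day D through 02:00 on day D+1 all receive the label D+1 (assumption).
    """
    totals = defaultdict(float)
    for start, rate in hourly.items():
        totals[(start + timedelta(hours=21)).date()] += rate
    return dict(totals)


def mask_to_gauge(sbp_daily, gauge_daily):
    """Set the SBP value to missing (None) on days when the gauge is missing."""
    return {day: (sbp_daily.get(day) if obs is not None else None)
            for day, obs in gauge_daily.items()}


def monthly_means(daily, min_days=26):
    """Monthly mean (mm/day), kept only for months with more than 25 valid days."""
    valid = defaultdict(list)
    for day, value in daily.items():
        if value is not None:
            valid[(day.year, day.month)].append(value)
    months = {(day.year, day.month) for day in daily}
    return {m: (sum(valid[m]) / len(valid[m]) if len(valid[m]) >= min_days else None)
            for m in months}


if __name__ == "__main__":
    # Two days of a constant 1 mm/h rate, starting at 03 UTC on 1 March 2014.
    first_hour = datetime(2014, 3, 1, 3)
    hourly = {first_hour + timedelta(hours=h): 1.0 for h in range(48)}
    print(daily_totals_03utc(hourly))
    print(monthly_means({date(2014, 3, d): 2.0 for d in range(1, 27)}))
```

Extracting the satellite value at the station's grid cell, rather than gridding the gauges, keeps the gauge record untouched, which is the motivation given in the text for the point-to-pixel approach.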
The overall mean and the summer monsoon (JJAS) mean precipitation of the SBP and gauge-observed datasets during the study period were calculated for each station. Mean monthly regional datasets were computed for three regions, with stations located at longitudes of 80-82° E, 83-85° E and 86-88° E being grouped together as the western, central, and eastern regions, respectively (Figure 1). The stations were also divided into three physiographic regions (elevation intervals) to quantify the spatial patterns of the SBP products. All stations below 1500 m, between 1500 and 2500 m, and above 2500 m were aggregated into the low-elevation, mid-elevation, and high-elevation regions, respectively.
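The regional and elevation grouping can be illustrated with a short Python sketch. It is an illustration only: the function names and the station record layout are hypothetical, the region boundaries are assumed to fall at 83° E and 86° E (the text lists the ranges 80-82° E, 83-85° E and 86-88° E), and stations at exactly 1500 m or 2500 m are assumed to belong to the mid-elevation band.

```python
from collections import defaultdict


def longitude_region(lon_deg):
    """Assign a station to the western, central or eastern region."""
    if lon_deg < 83.0:   # assumed boundary between western and central
        return "western"
    if lon_deg < 86.0:   # assumed boundary between central and eastern
        return "central"
    return "eastern"


def elevation_band(elev_m):
    """Assign a station to the low-, mid- or high-elevation band."""
    if elev_m < 1500:
        return "low"
    if elev_m <= 2500:
        return "mid"
    return "high"


def grouped_mean(stations, classify):
    """Average station precipitation (mm/day) within each group.

    `stations` is a list of dicts with the hypothetical keys 'lon', 'elev'
    and 'precip'; stations with a missing value (None) are skipped.
    """
    groups = defaultdict(list)
    for station in stations:
        if station["precip"] is not None:
            groups[classify(station)].append(station["precip"])
    return {name: sum(values) / len(values) for name, values in groups.items()}


if __name__ == "__main__":
    demo = [
        {"lon": 81.2, "elev": 200, "precip": 4.0},
        {"lon": 84.0, "elev": 1700, "precip": 10.0},
        {"lon": 84.5, "elev": 900, "precip": 6.0},
    ]
    print(grouped_mean(demo, lambda s: longitude_region(s["lon"])))
    print(grouped_mean(demo, lambda s: elevation_band(s["elev"])))
```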
We also classified the stations by their gauge-observed summer monsoon precipitation rate; all stations with mean monsoonal precipitation rates of less than 10 mm/day, between 10 and 20 mm/day, and higher than 20 mm/day were assigned to the lower, moderate, and higher precipitation zones, respectively. This allows analysis of SBP performance for different precipitation rates. High-intensity extreme rainfall events in the four SBP datasets, defined as those exceeding the 95th percentile (R95p) of observed precipitation, were also examined. For this analysis, only stations with daily observed data available for more than 90% of each year were selected. The overall process followed in the study is shown in Figure 2.
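A possible implementation of the precipitation-zone classification and the R95p threshold is sketched below in Python. Several details are assumptions rather than statements from the text: the percentile is taken over wet days only (at least 1 mm/day) with linear interpolation between ranks, the data-availability check is applied to one year at a time, and all function names are hypothetical.

```python
import math


def precipitation_zone(mean_monsoon_rate):
    """Classify a station by its mean monsoon precipitation rate (mm/day)."""
    if mean_monsoon_rate < 10:
        return "lower"
    if mean_monsoon_rate <= 20:
        return "moderate"
    return "higher"


def percentile(values, p):
    """p-th percentile with linear interpolation between sorted values."""
    data = sorted(values)
    pos = (len(data) - 1) * p / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def has_enough_data(obs_one_year, min_fraction=0.9):
    """True when more than 90% of the days in one year have a valid value."""
    valid = sum(1 for v in obs_one_year if v is not None)
    return valid / len(obs_one_year) > min_fraction


def r95p_threshold(obs_daily, wet_threshold=1.0):
    """95th percentile of observed wet-day precipitation (wet days: assumption)."""
    wet = [v for v in obs_daily if v is not None and v >= wet_threshold]
    return percentile(wet, 95)


def count_extremes(daily, threshold):
    """Number of days on which precipitation exceeds the threshold."""
    return sum(1 for v in daily if v is not None and v > threshold)


if __name__ == "__main__":
    gauge = [None, 0.0, 0.5] + [float(v) for v in range(1, 22)]
    limit = r95p_threshold(gauge)
    print(limit, count_extremes(gauge, limit))
```

Applying the gauge-derived threshold to both the gauge and the SBP series gives event counts that can be compared directly, which is one way to express the under- or overestimation of extreme-event frequency.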

Statistical Analysis
Several statistical metrics were calculated from monthly data to quantify the accuracy of, or differences between, the gauge observations and the precipitation estimated by the SBP products. The correlation coefficient (R, Equation (1)) was used to measure the strength and direction of the linear association between datasets. The Relative Root Mean Square Error (RRMSE, Equation (2)) reflects the average magnitude of the deviation of a dataset from the gauge-observed data, while the mean bias (MB, Equation (3)) and absolute relative error (RE, Equation (4)) measure any persistent tendency of a dataset to either overestimate or underestimate, and the discrepancies between the magnitude of the estimated precipitation and the gauge-observed dataset. Graphical plots and different statistical measures were used to facilitate the inter-comparison between the SBP and gauge-observed datasets. In the formulae for these statistical metrics, O is the gauge observation data, E is the precipitation estimated by the SBP products, Ō and Ē denote the average values of the respective datasets, and n is the sample size. The perfect score for RRMSE, MB, and RE is 0, while for R it is 1. Additionally, a daily performance assessment was carried out for all SBP data based on categorical statistics. These statistics were computed for the individual stations to quantify the capacity to detect daily precipitation events. The statistics are based on a contingency table (Table 2) with two possible cases: a day with or without precipitation. In Table 2, a and d indicate the total events above 1 mm/day recorded by both datasets (gauge observed and SBP), while c and d indicate the total events recorded by both datasets below this threshold. This threshold value was selected to avoid the measurement error of the manual gauge system for light precipitation amounts (less than 1 mm/day).
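The four continuous metrics can be sketched in Python as below. Because Equations (1)-(4) are not reproduced in this text, the definitions used here are common textbook forms and should be read as assumptions: R is the Pearson correlation, MB is the mean of E minus O, RRMSE is the root mean square error divided by the mean observation, and RE is the absolute difference between the estimated and observed totals divided by the observed total. The function names are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def correlation(obs, est):
    """Pearson correlation coefficient R between observations and estimates."""
    o_bar, e_bar = _mean(obs), _mean(est)
    cov = sum((o - o_bar) * (e - e_bar) for o, e in zip(obs, est))
    var_o = sum((o - o_bar) ** 2 for o in obs)
    var_e = sum((e - e_bar) ** 2 for e in est)
    return cov / math.sqrt(var_o * var_e)


def mean_bias(obs, est):
    """MB in the units of the data; negative values mean underestimation."""
    return _mean([e - o for o, e in zip(obs, est)])


def rrmse(obs, est):
    """RMSE normalised by the mean observation (assumed normalisation)."""
    rmse = math.sqrt(_mean([(e - o) ** 2 for o, e in zip(obs, est)]))
    return rmse / _mean(obs)


def relative_error(obs, est):
    """Absolute relative error of the totals (assumed definition)."""
    return abs(sum(est) - sum(obs)) / sum(obs)


if __name__ == "__main__":
    gauge = [1.0, 2.0, 3.0, 4.0]       # monthly means in mm/day (made-up values)
    satellite = [0.8, 1.5, 2.9, 3.1]   # made-up values for illustration
    print(correlation(gauge, satellite), mean_bias(gauge, satellite))
    print(rrmse(gauge, satellite), relative_error(gauge, satellite))
```

Under these definitions MB keeps its sign, so it shows the direction of the error, whereas RRMSE and RE are dimensionless and can be compared between wet and dry regions.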
In this study, three categorical indices were used for the assessment: the probability of detection (POD, Equation (5)), which measures the SBP product's capacity to detect gauge-observed precipitation events and ranges from 0 to 1 (with 1 being a perfect score); the False Alarm Ratio (FAR, Equation (6)), which represents how often the SBP product reports a precipitation event that the gauge did not observe and ranges from 0 to 1 (with 0 being a perfect score); and Accuracy (ACC, Equation (7)), which is the fraction of all SBP product-based events that were correct and ranges from 0 to 1 (with 1 being a perfect score). All these metrics were computed for individual stations using the respective daily precipitation series for the study period, with a threshold value of 1 mm/day to separate precipitation and no-precipitation events.
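The categorical scores can be computed as in the Python sketch below. The labelling of the contingency counts is the conventional one and is an assumption here, since Table 2 and Equations (5)-(7) are not reproduced in this text: a counts hits, b false alarms, c misses and d correct negatives, with POD = a/(a + c), FAR = b/(a + b) and ACC = (a + d)/(a + b + c + d). A day is treated as wet when precipitation is at least 1 mm/day, and the function names are hypothetical.

```python
def contingency(gauge, sbp, threshold=1.0):
    """Count hits (a), false alarms (b), misses (c) and correct negatives (d)."""
    a = b = c = d = 0
    for obs, est in zip(gauge, sbp):
        if obs is None or est is None:
            continue  # skip days with a missing value in either series
        obs_wet, est_wet = obs >= threshold, est >= threshold
        if obs_wet and est_wet:
            a += 1
        elif est_wet:
            b += 1
        elif obs_wet:
            c += 1
        else:
            d += 1
    return a, b, c, d


def _ratio(numerator, denominator):
    """Return NaN instead of raising when a score is undefined."""
    return numerator / denominator if denominator else float("nan")


def categorical_scores(gauge, sbp, threshold=1.0):
    """POD, FAR and ACC for one station's daily series."""
    a, b, c, d = contingency(gauge, sbp, threshold)
    return {
        "POD": _ratio(a, a + c),
        "FAR": _ratio(b, a + b),
        "ACC": _ratio(a + d, a + b + c + d),
    }


if __name__ == "__main__":
    gauge_mm = [0.0, 2.0, 5.0, 0.5, 3.0]
    sbp_mm = [1.5, 2.0, 0.0, 0.0, 4.0]
    print(contingency(gauge_mm, sbp_mm))
    print(categorical_scores(gauge_mm, sbp_mm))
```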

The Spatial Pattern of Precipitation in Nepal
The spatial distributions of daily mean precipitation (mm/day) in the observations and the four SBP datasets during the study period are presented in Figure 3. The observed datasets show large spatial variability of precipitation across the country. The highest mean precipitation amount (>10 mm/day) was observed in the Lumle area (28.3° N, 84° E), whereas the lowest amounts (<2 mm/day) were observed in the high-elevation areas of the central and western regions (Figure 3a). The lowest precipitation occurs in the high-elevation areas of the central region, suggesting that the high mountains block atmospheric moisture from moving northward and considerably increase (decrease) precipitation on the southern (northern) slopes of the central region. Compared with the observed spatial distribution, all four SBP datasets generally reproduced the main characteristic, namely that high precipitation occurs in central Nepal. However, they differed largely in precipitation totals and location accuracy. The mean precipitation distribution from GSMaP-Gauge shows very similar characteristics to the observations, with the maximum precipitation (approximately 10-12 mm/day) at 28.3° N, 84° E (Figure 3e), whereas GSMaP-MVK shows the maximum precipitation (approximately 5-7 mm/day) at 28.5° N, 84° E (Figure 3c). In contrast, IMERG-UC shows the maximum precipitation (approximately 10-12 mm/day) at 27.9° N, 84.8° E (Figure 3b), while IMERG-C shows high precipitation (approximately 4-5 mm/day) at 28.2° N, 84° E (Figure 3d). Another area of highest rainfall in IMERG-C (26.5° N, 88° E) might be associated with the monsoon trough, as seen over the lower reaches of the eastern region (Figure 3d). Notably, all four datasets are dry (<2 mm/day) in the high-elevation areas of the central and western regions (Figure 3b-e).
The large-scale patterns of precipitation, such as heavier orographic precipitation along the southern slopes of the mountain ranges in the central region and lower precipitation (<2 mm/day) over the northern slopes of the central and western regions (rain-shadow areas), are also qualitatively captured by all four satellite precipitation datasets. It is worth noting that IMERG-UC showed better agreement with the spatial pattern and amount of precipitation in the observations than IMERG-C, whereas GSMaP-Gauge showed significant improvement over GSMaP-MVK. Moreover, all four SBP datasets tend to underestimate the mean precipitation across the country. GSMaP-Gauge (Figure 3e) best reproduces the overall spatial pattern of mean precipitation, followed by IMERG-UC, GSMaP-MVK, and IMERG-C, respectively. The results suggest that the gauge correction scheme for the IMERG product requires further improvement in the study area.
The mean precipitation in the winter (December to February) was heavier over the western region (0.74 mm/day) than in the central (0.54 mm/day) and eastern (0.28 mm/day) regions. In contrast, western Nepal was drier (4.90 mm/day) than the central (7.02 mm/day) and eastern (6.85 mm/day) regions during the other seasons (March to November) due to the influence of the summer monsoon. The precipitation in winter is primarily influenced by the westerlies system and is more pronounced in the western part of the country, while moisture transfer from the Bay of Bengal (monsoon) produces widespread precipitation over the country during the monsoon season (JJAS). All four SBP datasets show higher precipitation during the summer monsoon and lower precipitation in winter, with the maximum in July except for GSMaP-MVK in the eastern region (Figure 4c). The satellite-only datasets overestimated the precipitation during the winter and pre-monsoon seasons; after the gauge calibration, however, this positive bias was reduced and the estimates became more consistent with the observed datasets.
Figure 4 indicates that, among the four SBP datasets, the gauge-calibrated datasets (i.e., IMERG-C and GSMaP-Gauge) best represent the seasonal precipitation variation across all three regions of Nepal, although all of them underestimate precipitation. Nevertheless, all four SBP datasets captured the seasonal precipitation dynamics across the country well.
For a detailed analysis, the statistical metrics of the four SBP datasets from 2015 to 2016 were calculated against the station observations (Table 3). In the western region, IMERG-UC and GSMaP-MVK showed smaller MBs than their gauge-calibrated counterparts, i.e., IMERG-C and GSMaP-Gauge, respectively. Nevertheless, both gauge-calibrated datasets showed better overall performance, as indicated by lower RRMSE and higher R-values (Table 3). For the central region, IMERG-UC showed a smaller MB (−0.93 mm/day) than IMERG-C (−1.48 mm/day); however, both have similar RRMSE. Meanwhile, GSMaP-Gauge outperformed GSMaP-MVK and both IMERG datasets, as indicated by the lowest MB and RRMSE (Table 3). In the eastern region, both gauge-calibrated datasets showed MBs very similar to those of their corresponding satellite-only datasets, although the gauge-calibrated IMERG-C performed more consistently, with a smaller RRMSE of 0.18, followed by GSMaP-Gauge. For the whole country, among all products, IMERG-UC showed the smallest MB of −0.47 mm/day and IMERG-C showed the lowest RRMSE of 0.28. It is worth noting that the positive bias in IMERG-UC and GSMaP-MVK between January and June offsets the negative bias from July to October, which results in smaller MBs for these datasets (Figures 4a-c and A1). The seasonal performances of all four SBP datasets were calculated to check the consistency in different seasons and are presented in Table A1. The seasonal performance also showed that the gauge-calibrated datasets represent the seasonal dynamics better than the satellite-only datasets, as indicated by lower MBs and RRMSE and higher R-values in Table A1. In general, all four datasets exhibited high correlations (R > 0.80), which indicates that the seasonal precipitation dynamics across the country can be captured by all four datasets.
Table 3. Statistical metrics in the western, central, and eastern regions, as well as in the whole study region, derived from the regional monthly mean precipitation (mm/day) from 2015 to 2016. Bold font indicates the best performance for a given metric.

Elevation Dependency
Knowledge of the elevation gradient of precipitation is vital for many hydro-meteorological applications. Because the larger portion of precipitation occurs during the summer monsoon season, the elevation dependency was investigated based on monthly data from that season. Mean summer monsoon precipitation from the observations and the four SBP datasets was averaged within elevation ranges of 500 m, from 60 m to below 3000 m, for the study period (Figure 5). The number of stations above 3000 m is very limited; thus, the elevation dependency was only calculated below 3000 m. The gauge observations reveal an evident elevation dependency of precipitation, as shown in Figure 5 for the monsoon period. Gauge observations show that precipitation gradually increases with increasing elevation up to 2000 m, and then decreases rapidly (black line, Figure 5a). The highest precipitation (approximately 13 mm/day) occurs in the range 1500-2000 m during the summer monsoon. These patterns are similar to the results of a previous study conducted using gauge observations [72]. IMERG-C moderately captured this evident elevation-dependent pattern, with the highest precipitation (approximately 10 mm/day) in the elevation range of 1500-2000 m. In contrast, the other three SBP products failed to capture this pattern; IMERG-UC and GSMaP-MVK showed the highest precipitation at the lowest elevation (below 500 m), and GSMaP-Gauge showed the highest precipitation in the elevation range 500-1000 m. In the higher elevation areas (above 3000 m), where stations are limited (14), GSMaP-Gauge and both IMERG datasets overestimated the observed precipitation (not shown in Figure 5). This could be associated with the complex terrain and orographic effect [73,74]. It is worth noting that the orographic rain-corrected GSMaP-Gauge showed variation in precipitation amount across elevation intervals (i.e., a pattern of precipitation increase and decrease) [75].
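The elevation binning described above can be sketched in Python as follows. This is an illustration under assumptions, not the study's code: bins are aligned to multiples of 500 m starting at 0 m (so the lowest bin contains the stations from 60 m upward), stations at or above 3000 m are excluded, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def elevation_profile(stations, width=500, top=3000):
    """Mean monsoon precipitation (mm/day) per elevation bin.

    `stations` is an iterable of (elevation_m, precipitation) pairs; pairs
    with a missing value (None) or an elevation outside [0, top) are skipped.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for elev, precip in stations:
        if precip is None or not 0 <= elev < top:
            continue
        lower = int(elev // width) * width  # lower edge of the bin
        sums[lower][0] += precip
        sums[lower][1] += 1
    return {(lower, lower + width): total / count
            for lower, (total, count) in sorted(sums.items())}


if __name__ == "__main__":
    # Made-up station values for illustration only.
    demo = [(60, 5.0), (450, 7.0), (1600, 13.0), (2700, 4.0), (3200, 1.0)]
    profile = elevation_profile(demo)
    print(profile)
    print("wettest bin:", max(profile, key=profile.get))
```

Running the same function on the gauge values and on each SBP dataset extracted at the station locations gives profiles that can be compared bin by bin, in the manner of Figure 5a.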
Additionally, we calculated the elevation dependency of the SBP datasets by averaging the precipitation across all grid boxes within different elevation ranges (Figure 5b). The numbers of stations and grid boxes in different elevation ranges are listed in Table 4. The grid-based elevation dependency of the SBP products showed a similar pattern to that of the point-pixel results, but with slightly different precipitation amounts. Overall, both satellite-only datasets, IMERG-UC and GSMaP-MVK, significantly underestimated the monsoonal precipitation amount. However, after gauge correction, the precipitation estimates of the gauge-calibrated datasets were more consistent with the gauge observations than those of the satellite-only datasets. Therefore, calibrating SBP products with rain gauge data accounts for their increased accuracy. Table 5 gives the statistical error metrics for the four SBP datasets across three different geographic regions, based on summer monsoon mean values at each station. In lowland areas (below 1500 m), IMERG-UC showed the smallest MB (−0.85 mm/day), indicating that its estimated precipitation amount was more consistent with the observations; meanwhile, GSMaP-Gauge showed better overall performance than the other three datasets, with a lower RRMSE (0.45) and a higher R-value (0.56). In the mid-elevation areas with the highest precipitation (between 1500 and 2500 m), both gauge-calibrated datasets were more consistent with the observations than the satellite-only datasets. IMERG-C performed best in estimating the precipitation amount, with the lowest MB (−3.07 mm/day), while GSMaP-Gauge presented the lowest RRMSE and a higher R-value, indicating the best performance among all datasets in reproducing the spatial distribution of gauge-observed precipitation. However, all four datasets underestimated the monsoon precipitation amount below 2500 m.
Table 5. Statistical metrics of the station mean precipitation at different elevation intervals (below 1500 m, between 1500 and 2500 m and above 2500 m), as well as in the whole study region, derived from the four SBP datasets and compared to the gauge observations during the summer monsoon for the study period. Bold font indicates the dataset with the best performance for a given metric.
In high-elevation regions (above 2500 m), characterized by complex topography with low precipitation, IMERG-C showed the best performance, with smaller errors (MB and RRMSE) and higher R than the other datasets, demonstrating that the calibration based on GPCC data significantly improved the IMERG product. Meanwhile, the GSMaP-Gauge product overestimated summer monsoon precipitation and showed very poor correlation with the gauge observations. As mentioned in Section 2.2.1, the GPCC data merged observations from 125 gauges in Nepal, while the NOAA/CPC data merged observations from only 54 gauges. Therefore, GPCC data may integrate more precipitation information than NOAA/CPC data, especially in high-elevation regions where gauge observations are very scarce. This might be the reason for the better performance of IMERG-C relative to GSMaP-Gauge in high-elevation regions.
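For readers who wish to reproduce metrics of the kind reported in Table 5, the following is a minimal sketch. The exact definitions used in the study are given in its methods section and are not reproduced here; the formulas below (MB as the mean of estimate minus observation, RRMSE as RMSE divided by the observed mean, R as the Pearson correlation) are common conventions and should be treated as assumptions. The function name and sample values are hypothetical.

```python
import math


def error_metrics(obs, est):
    """Return (MB, RRMSE, R) for paired gauge and satellite values.

    Assumed definitions for this sketch:
      MB    = mean(est - obs)
      RRMSE = sqrt(mean((est - obs)^2)) / mean(obs)
      R     = Pearson correlation coefficient
    """
    n = len(obs)
    if n != len(est) or n < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    mean_obs = sum(obs) / n
    mean_est = sum(est) / n
    mb = mean_est - mean_obs
    rmse = math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / n)
    rrmse = rmse / mean_obs
    cov = sum((o - mean_obs) * (e - mean_est) for o, e in zip(obs, est))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_est = sum((e - mean_est) ** 2 for e in est)
    r = cov / math.sqrt(var_obs * var_est)
    return mb, rrmse, r


# Hypothetical monsoon-mean values (mm/day) at four stations
gauge = [10.0, 12.0, 14.0, 8.0]
satellite = [8.0, 9.0, 12.0, 7.0]
mb, rrmse, r = error_metrics(gauge, satellite)
print(f"MB = {mb:.2f} mm/day, RRMSE = {rrmse:.2f}, R = {r:.2f}")
```

A negative MB, as in this hypothetical example, corresponds to the underestimation reported for most products below 2500 m.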

Overall, the gauge-calibrated products performed better than the satellite-only products on a monthly scale. IMERG-C yielded smaller MBs than GSMaP-Gauge, while GSMaP-Gauge showed smaller RRMSE and higher R-values, indicating that IMERG-C estimated the precipitation amount more consistently than GSMaP-Gauge, whereas GSMaP-Gauge presented a more reasonable spatial distribution than IMERG-C.
Figure 6 illustrates the spatial distributions of the POD, FAR and ACC values of the four SBP datasets at each station across the country. These values were calculated from daily precipitation data. POD is above 70% at most stations for GSMaP-Gauge, followed by 40-80% for the IMERG products and 40-60% for GSMaP-MVK. Therefore, the daily gauge-calibrated GSMaP-Gauge outperformed the other three datasets in detecting gauge-observed precipitation events. It is worth noting that POD is higher in GSMaP-Gauge than in GSMaP-MVK, while the two IMERG datasets perform similarly. The GSMaP-Gauge datasets were calibrated with daily NOAA/CPC data, so both the amount and the occurrence of precipitation were corrected, whereas the IMERG-C datasets were corrected with monthly GPCC data, so only the precipitation amounts were adjusted. This may be the reason for the similar POD performance of the two IMERG products. All four datasets showed similar FAR distributions, with the best performance in mid-elevation areas and poor performance in high-elevation areas, revealing that the error was lower where the precipitation amount was higher. ACC exceeded 70% in all four SBP datasets at most of the selected stations across the country (Figure 6). POD, FAR, and ACC were also calculated for the three elevation intervals (Table 6). The daily gauge-calibrated GSMaP-Gauge performed fairly well at detecting gauge-observed precipitation events, with PODs of 0.73, 0.71 and 0.66 for elevations below 1500 m, 1500-2500 m, and above 2500 m, respectively.
Notably, differences in the ACC and FAR scores were nominal (Table 6). In general, all four datasets performed with acceptable scores (ACC higher than 0.70) in detecting both the precipitation and no-precipitation events across the country.
Table 6. Performances of the four SBP datasets expressed by POD, FAR and ACC at each station, averaged over three elevation intervals (<1500 m, 1500-2500 m and >2500 m), as well as for the whole study region during the study period. A threshold value of 1 mm/day was selected to separate precipitation and no-precipitation events. Bold font indicates the dataset with the best performance for a given metric.
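The categorical scores can be computed from a 2x2 contingency table of daily events. The sketch below assumes the standard definitions POD = hits / (hits + misses), FAR = false alarms / (hits + false alarms), and ACC = (hits + correct negatives) / total; the study's own formulas are given in its methods section and may differ in detail. Treating a value equal to the 1 mm/day threshold as an event is also an assumption, and the function name and sample series are hypothetical.

```python
def categorical_scores(obs, est, threshold=1.0):
    """Return (POD, FAR, ACC) from paired daily precipitation series.

    A day counts as a precipitation event when the value is >= threshold
    (mm/day); whether the boundary is inclusive is an assumption here.
    """
    hits = misses = false_alarms = correct_negatives = 0
    for o, e in zip(obs, est):
        obs_wet, est_wet = o >= threshold, e >= threshold
        if obs_wet and est_wet:
            hits += 1
        elif obs_wet:
            misses += 1
        elif est_wet:
            false_alarms += 1
        else:
            correct_negatives += 1
    total = hits + misses + false_alarms + correct_negatives
    nan = float("nan")
    pod = hits / (hits + misses) if hits + misses else nan
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else nan
    acc = (hits + correct_negatives) / total if total else nan
    return pod, far, acc


# Hypothetical daily series (mm/day) at one station
gauge_daily = [0.0, 2.5, 5.0, 0.4, 12.0, 0.0, 1.0, 3.0]
sbp_daily = [0.0, 1.2, 0.3, 1.5, 9.0, 0.0, 2.0, 0.0]
pod, far, acc = categorical_scores(gauge_daily, sbp_daily)
print(f"POD = {pod:.2f}, FAR = {far:.2f}, ACC = {acc:.3f}")
```

Because a monthly calibration rescales amounts without changing which days are wet, it would leave these scores largely unchanged, which is consistent with the similar POD of the two IMERG products noted above.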

Extreme Precipitation Events
The spatial patterns of extreme precipitation events identified by the four SBP datasets are presented in Figure 7.
Figure 7. (a,c,e,g) Accurately detected extreme precipitation events (%) and (b,d,f,h) bias in extreme precipitation events for each station. The black and blue lines denote the national boundary of the country and the 3000 m a.s.l. elevation contour, respectively.
Extreme precipitation events are defined as those exceeding the 95th percentile (high-intensity extreme) values in the gauge-observed datasets from 2015 to 2016 for each station. Extreme events were calculated only for stations with daily observation data available for more than 90% of the year. Figure 7 shows the extreme events detected by the SBP datasets on the same day as those in the gauge-observed datasets (temporal accuracy) and the mean bias in the total number of extreme events at each station across the country, respectively. GSMaP-Gauge shows a moderate improvement over GSMaP-MVK, especially in central Nepal, where more precipitation was observed than in other areas (Figure 7e,g). The spatial distribution of the extreme events suggests that all four SBP datasets have low accuracy and mostly underestimate the frequency of extreme events over the study area (Figure 7b,d,f,h). Figure 8 shows the performances of the four SBP datasets in detecting extreme events within the three elevation intervals. The statistics indicate that all four SBP datasets underestimated the frequency of extreme events in regions below 2500 m, while IMERG-C and GSMaP-Gauge showed many more false extreme events than the satellite-only products and thus overestimated the frequency in regions above 2500 m. As shown in Figure 8a, more extreme events were observed in regions below 2500 m (low and mid-elevation) than in high-elevation regions, and most DHM gauge stations (96.5% of the total) are also located in these regions. Therefore, the GPCC analysis dataset interpolated from data of 125 DHM gauge stations and the NOAA/CPC dataset interpolated from data of 54 DHM gauge stations, which were used to calibrate the IMERG and GSMaP products respectively, may present a spuriously high occurrence of extreme events in regions above 2500 m. This may explain why the calibrated SBP products overestimated the frequency of extreme events in regions above 2500 m.
According to Figure 8, IMERG-UC performed much better than the other products in representing the occurrence of extreme events, especially in regions below 1500 m, suggesting that calibration may sometimes skew important signals contained in the satellite-only product. In general, all four SBP products had low accuracy (Figure 8b) and underestimated the frequency of extreme events (Figure 8a) across the country.
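The extreme-event evaluation above can be sketched as follows. Several details are assumptions made for illustration and are not confirmed by the text: the 95th percentile is taken over all days of the gauge record (including dry days), the same gauge-derived threshold is applied to the SBP series, "exceeding" is read as strictly greater than, and the bias is expressed as the difference in event counts. The function names and sample series are hypothetical.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def extreme_event_scores(obs, est, q=95.0):
    """Return (threshold, same-day accuracy in %, bias in event count).

    Assumptions of this sketch: the threshold is the q-th percentile of all
    gauge days, and the same threshold is applied to the satellite series.
    """
    threshold = percentile(obs, q)
    obs_days = {i for i, o in enumerate(obs) if o > threshold}
    est_days = {i for i, e in enumerate(est) if e > threshold}
    same_day = len(obs_days & est_days)
    accuracy = 100.0 * same_day / len(obs_days) if obs_days else float("nan")
    bias = len(est_days) - len(obs_days)
    return threshold, accuracy, bias


# Hypothetical 20-day daily series (mm/day) at one station
gauge_daily = [0, 0, 0, 1, 2, 0, 5, 30, 0, 0, 3, 0, 45, 0, 0, 2, 1, 0, 60, 0]
sbp_daily = [0, 0, 1, 0, 3, 0, 4, 48, 0, 0, 2, 0, 20, 0, 0, 1, 0, 0, 50, 0]
threshold, accuracy, bias = extreme_event_scores(gauge_daily, sbp_daily)
print(f"threshold = {threshold:.2f} mm/day, accuracy = {accuracy:.0f}%, bias = {bias:+d}")
```

In this hypothetical series the product detects the gauge extreme on the correct day but also flags one additional day, giving a positive bias of the kind reported for the calibrated products above 2500 m.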

Discussion
SBP products provide new alternatives for station observations; however, uncertainties are associated with both gauge observations and SBP estimations. Unfortunately, these uncertainties are difficult to quantify and may have influenced the above results.
Gauge instruments may suffer from systematic biases caused by wind-induced evaporation loss and underestimation of trace values. These biases are more prominent during the winter due to the lower precipitation totals and a higher prevalence of snow. Due to the lack of automatic gauge stations in high-elevation areas, conventional rain gauges measure qualitative precipitation amounts (rainfall + snowfall). Therefore, the evaluation of datasets during the summer is more reliable than during the winter, and the evaluation in this study mainly focuses on the summer monsoon season, when the effect of evaporation loss on the observations is not significant because of the large precipitation totals. It is worth noting that the gauge-observed datasets used in this study are not wind corrected, owing to the lack of wind speed data for the selected stations, yet they were used to evaluate the IMERG-C datasets, which are calibrated with the wind-loss-corrected GPCC gauge analyses. This also reduces the certainty of the evaluation results in Section 3. Besides, among the DHM stations used in this study, data from 125 stations were merged to produce the GPCC gauge analysis, which was used to calibrate the GPM-IMERG product. Similarly, data from about 54 stations were merged to produce the NOAA/CPC gauge-based analysis, which was used to correct the GSMaP product. These overlaps may lead to underestimation of the evaluation errors, which were not identified due to lack of information.
Furthermore, previous studies revealed quite large variability of precipitation in the high-elevation areas of Nepal and highlighted the importance of nocturnal precipitation [43]. In these regions, most DHM gauge stations are located in valley bottoms [76,77], where nocturnal precipitation prevails. In such regions, the daily DHM gauge data do not capture the representative precipitation variability and may leave a gap in the performance quantification of SBP products. On the other hand, the calibrated SBP products are corrected using gauge-based analysis datasets, which present finer spatial patterns than the natural pattern, especially in mountainous regions. However, the calibrated SBP products may also smooth the true spatial pattern in mountainous regions, and even skew some important signals contained in the satellite-only products.
Several previous studies also mentioned that topography and regional climate are dominant factors influencing the precipitation retrieval algorithms used in TRMM and GPM precipitation datasets [78]. The accuracy of SBP data depends on various factors, such as regional effects and precipitation intensities. To quantify the performance of the SBP datasets for different precipitation intensities, scatter plots of gauge-observed versus SBP precipitation rates were drawn using monthly precipitation data during the summer monsoon (Figure 9).
GSMaP-Gauge, IMERG-C, IMERG-UC and GSMaP-MVK performed better for low precipitation rates (<10 mm/day) than for high precipitation rates (>10 mm/day), indicating that the SBP datasets have difficulty in estimating heavy precipitation. These SBP datasets also overestimated the amount of light rain and underestimated the amount of heavy rain. Such underestimation of heavy rain could explain the underestimated frequency of high-intensity gauge-observed extreme events (Figures 7b and 8) across the country. These discrepancies are primarily related to false precipitation in the form of light rain or solid precipitation and to underestimation of heavy precipitation, respectively. Also, the complex physiography of the study region may affect the upward microwave radiation, which makes it difficult for the satellite to resolve precipitation over areas with low precipitation amounts, especially in mountainous regions [79,80].
SBP datasets are indirect measurements based on satellite/sensor constellations, including PMW and IR sensors onboard LEO and geostationary satellites. These datasets may not accurately detect precipitation in high-elevation areas [81], especially in the winter season, when the ground surface is covered with snow and ice [13]. The errors in PMW-estimated precipitation arise mainly because the retrieval relies on the scattering signal, which cannot capture the warm/low-level precipitation that frequently occurs in low-elevation areas. Further errors stem from the algorithm used to interpolate to finer time scales, such as the use of cloud motion vectors derived from IR.
Figure 9. Scatter plot of differences in precipitation rates between gauge-observed and SBP products, derived from monthly mean precipitation averaged over the monsoon season (mm/day). All the units of statistical metrics are in mm/day. Black, blue, and red colors indicate the performance statistics for precipitation rates less than 10 mm/day, between 10 and 20 mm/day, and above 20 mm/day, respectively. The solid and dotted black lines represent the linear regression and 1:1 lines, respectively.
To reduce such bias, satellite-only estimates were calibrated using the gauge-based GPCC and CPC datasets. The performances of the IMERG-C and GSMaP-Gauge datasets were also influenced by the quality and temporal range of the gauge-based GPCC and CPC datasets used for calibration, respectively. Meanwhile, satellite-only datasets are only effectively adjusted in areas where gauge data are available. Our results showed a substantial improvement in the gauge-calibrated SBP datasets, which are more consistent than the satellite-only datasets owing to the gauge adjustments. This result is similar to studies conducted in Central Asia [82], China [64,83], East Africa [84] and Ethiopia [85]. However, the deterioration of IMERG-C in low-elevation areas (Figures 3 and 7 and Table 5) compared to IMERG-UC, as well as the deterioration of GSMaP-Gauge in high-elevation areas (Figure 8c and Table 5) compared to GSMaP-MVK, might be related to the limited density of the rain gauges used for adjustment (GPCC and CPC). Such discrepancies indicate that the IMERG-C and GSMaP-Gauge retrieval algorithms need further improvement, particularly for mountainous areas such as Nepal. Additionally, both PMW and IR sensors have difficulty detecting shallow orographic precipitation [61,86,87]. We found that local weather conditions and the nature of the topography also influence the ability of SBP products to capture rainfall.

Conclusions
This study evaluated the latest four SBP products on the southern slope of the central Himalayas, Nepal, and compared both satellite-only (IMERG-UC and GSMaP-MVK) and gauge-calibrated (IMERG-C and GSMaP-Gauge) products for their accuracy and discrepancies against measurements from 388 gauges from March 2014 to December 2016. Conventional statistical metrics and categorical scores were used to quantify the performances of these SBP products.
Precipitation estimates differ widely between SBP products, depending on the season and location. The GSMaP-Gauge dataset was more consistent in representing the spatial pattern of observed precipitation, followed by IMERG-UC, GSMaP-MVK, and IMERG-C. However, all four datasets can capture the seasonal precipitation dynamics across the country. Among them, IMERG-C and GSMaP-Gauge presented a seasonal dynamic range more consistent with the gauge observations (in terms of RRMSE and R) than the satellite-only datasets. Even though all four SBP products underestimate the gauge-observed precipitation across Nepal, both gauge-calibrated SBP datasets performed better (lower RRMSE, higher R) than the satellite-only datasets. IMERG-C and GSMaP-Gauge showed similar errors (MB and RRMSE) in Nepal, although both had discrepancies in capturing the precipitation patterns. For instance, GSMaP-Gauge presented a more reasonable spatial distribution, while IMERG-C moderately reproduced the evident elevation-dependent pattern of precipitation revealed by gauge observations, i.e., precipitation increasing with elevation below 2000 m and then decreasing above 2000 m.
When 1 mm/day was selected as the threshold defining a daily rainfall event, GSMaP-Gauge, benefiting from the merged daily gauge-based NOAA/CPC analysis data, performed best (with higher POD) among the four datasets in detecting gauge-observed precipitation events. Gauge observations indicated that more high-intensity precipitation extreme events (95th percentile) occur in regions with an elevation below 2500 m. All four SBP datasets underestimated the total frequency of extreme precipitation events across the country. It is worth noting that IMERG-UC performed much better than the other products in representing the occurrence of extreme events, especially in regions below 1500 m, suggesting that calibration may sometimes skew important signals contained in the satellite-only product.
The present work addresses the lack of systematic evaluation of the latest two SBP products on the southern slope of the central Himalayas, Nepal. This evaluation provides a statistical basis for rigorous data selection in meteorological, hydrological, glaciological, and disaster-related studies within the study region. We recommend further evaluation of SBP products based on the weather characteristics over the complex terrain, which may provide useful information to algorithm developers and data users.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-4292/12/11/1836/s1, Table S1: Location and data availability of selected rain gauge station.
Acknowledgments: The authors are thankful to the scientists at NASA and JAXA who were responsible for the development of the IMERG and GSMaP products. The DHM and the Evk2-CNR committee (Pyramid station) are also acknowledged for providing the rain-gauge precipitation datasets. The authors would like to thank three anonymous reviewers for their insightful comments and suggestions, which helped to improve the manuscript.

Conflicts of Interest:
There is no conflict of interest among the authors.
Figure A1. (a) Time series of monthly mean precipitation and (b) bias between observed and four SBP datasets averaged over the study area.