Assessment of GPM-Era Satellite Products’ (IMERG and GSMaP) Ability to Detect Precipitation Extremes over Mountainous Country Nepal

Abstract: The reliability of satellite precipitation products is important in climatic and hydro-meteorological studies, especially in mountainous regions, where observations are sparse. Two recent satellite rainfall estimates (SREs) from the Global Precipitation Measurement (GPM) era, the Integrated Multi-Satellite Retrievals for GPM (IMERG-V06) and the gauge-calibrated Global Satellite Mapping of Precipitation (GSMaP-V07), are evaluated for their spatiotemporal accuracy and ability to capture extreme precipitation events using 279 gauge stations on the southern slope of the central Himalaya, Nepal, between 2014 and 2019. The overall result suggests that both SREs can capture the spatiotemporal precipitation variability, although both underestimate the observed precipitation amount. Between the two, the IMERG product shows a more consistent performance, with a higher correlation coefficient (0.52) and smaller bias (−2.49 mm/day) than the GSMaP product. It is worth mentioning that the monthly gauge-calibrated IMERG product yields better detection capability (higher probability of detection (POD) values) for daily precipitation events than the daily gauge-calibrated GSMaP product; however, both show similar performance in terms of false alarm ratio (FAR) and critical success index (CSI). Assessment based on extreme precipitation indices revealed that the IMERG product outperforms GSMaP in capturing daily precipitation extremes (RX1day and RX5day). In contrast, the GSMaP product tends to be more consistent in capturing the duration- and threshold-based precipitation extremes (consecutive dry days (CDD), consecutive wet days (CWD), number of heavy precipitation days (R10mm), and number of extreme precipitation days (R25mm)). Therefore, it is suggested that the IMERG product can be a good alternative for monitoring daily extremes, while GSMaP could be a better option for duration-based extremes in the mountainous region.


Introduction
Precipitation is the result of the complex interaction between various atmospheric components at multiple levels and scales [1]. It is highly variable in both space and time because of its discrete nature. As a primary driver of the hydrological cycle, its measurement and estimation play a crucial role in water resource management and climate change adaptation strategy [2]. In the context of global warming and rising precipitation extremes, accurate measurement of precipitation is crucial for understanding its change and variability over space and time. Therefore, high resolution gridded precipitation datasets (DPR) [28,42]. The GPM mission was launched jointly in 2014 by the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA). After the GPM mission, NASA introduced a new SRE product, the Integrated Multi-satellite Retrievals for GPM (IMERG), which combines the data from multiple satellite sensors [28]; meanwhile, JAXA updated to a newer version of GSMaP (i.e., GSMaP-Version 7) with a newer orographic rainfall correction scheme [43]. Although these SREs are available at improved spatial and temporal resolutions, topographical variation strongly influences their overall performance [44], and they face multiple challenges, particularly over mountainous areas [34]. These challenges are more prominent over Nepal, as the east-west extended mountain range often gives rise to warm orographic rain, which IR and PMW sensors have difficulty capturing [33,34,36,37]. Furthermore, the inability of PMW sensors to differentiate between frozen hydrometeors and surface snow also creates uncertainty in SREs [33,35]. To improve accuracy, several gauge-based precipitation products are used to calibrate SREs [45]. Nevertheless, these SREs are indirect measurement tools and need to be verified and calibrated with gauge observations before further application [37,46].
Several studies have evaluated SREs for different hydro-meteorological applications globally [2,3,18,20,23,43,47-52]. A study by Liu et al. [47] found that the IMERG monthly product captured the heavy precipitation regions in both hemispheres. Similarly, Prakash et al. [50] reported that the IMERG product represented the mean monsoon rainfall and its variability more realistically than the gauge-adjusted GSMaP over India. A statistical comparison over the Yellow River Basin, China, found that GSMaP-Gauge performed better at the daily time scale, while IMERG performed better at the monthly scale [49]. A comprehensive evaluation of multiple precipitation products (SREs and reanalysis) in China revealed that IMERG outperformed other datasets except for GSMaP [51]. Furthermore, only a few studies have evaluated GPM-era SREs focusing on complex terrain areas such as Ethiopia, Italy [53], and India [18,48,54], revealing that GSMaP-V07 and IMERG-V05 tend to underestimate rainfall over complex mountainous areas. Similarly, the GSMaP-V07 product surpasses IMERG-V06B over multiple complex regions of the world [31].
Apart from global studies, only a few studies have evaluated GPM-era SREs over the most complex region, Nepal. For example, Sunil Kumar et al. [32] evaluated, for the first time, the GPM-IMERG-V05 precipitation product against the Asian Precipitation-Highly Resolved Observational Data Integration Towards Evaluation (APHRODITE-2 V1801R1) over the Asian region, including Nepal, and noted that IMERG underestimated the magnitude of rainfall during the wet season over Nepal. Recently, Sharma et al. [48] evaluated the GPM-IMERG and GSMaP products focusing on Nepal and revealed that the gauge-calibrated IMERG performed better in estimating precipitation amount, while the gauge-calibrated GSMaP shows better spatial agreement with gauge observations.
Most of the above-mentioned global and regional studies suggested that continuous evaluation of SREs, especially over complex topography, is still needed to further advance the product algorithms [34,43,53,55,56]. Sharma et al. [18] evaluated the TRMM Multi-satellite Precipitation Analysis (TMPA) and IMERG for monitoring precipitation extremes using only four extreme indices and found that both products can capture precipitation extremes (high-intensity and drought) over Nepal. Moreover, most of the earlier studies considered a limited number of stations for performance evaluation in Nepal. Additionally, none of the studies has conducted a comprehensive assessment of the ability of SREs to estimate extreme precipitation using the indices of the Expert Team for Climate Change Detection Monitoring and Indices (ETCCDI). Thus, the main objective of this study is to evaluate the performance of two prominent SREs from the GPM era, the gauge-calibrated GSMaP-V07 and gauge-calibrated IMERG-V06 products, using a large-scale gauge network (i.e., 279 stations) over the entire Himalayan country, Nepal, during 2014-2019. More specifically, we aim to evaluate the spatiotemporal performance of these SREs based on (a) various descriptive statistics and (b) six different extreme precipitation indices.

(Figure 1). Lying on the southern slope of the central Himalayas, the topography varies from ~59 m in the southern lowlands to 8848.86 m (i.e., Mt. Everest) in the northern Himalayan region. The diverse topography creates unique climate conditions that vary from tropical and subtropical in the southern lowland areas to polar and tundra in the northern high mountain region [57]. Based on the modified Köppen-Geiger climatic classification, Karki et al. [57] identified eight different climate types (Aw, BSk, Cwa, Cwb, Dwb, Dwc, ET, and EF) in the country (Figure 1b).
Among them, Cwa over the southern lowlands (<400 m elevation) covers 30% of the total area, whereas Dwc over the northern higher mountainous area has the lowest area coverage of 0.5%, which is mostly characterized by ET (polar tundra) climate. Furthermore, the complex atmospheric interaction with rugged terrain leads to heterogeneous weather and climatic conditions across Nepal. The South Asian summer monsoon (SASM) and westerlies dominate the seasonal variability, with maximum rainfall during the summer monsoon season (June-September, JJAS; ~80% of annual precipitation), followed by the pre-monsoon (March-May, MAM, 12.5%), post-monsoon (October-November, ON, 4.0%), and winter (December-February, DJF, 3.5%) seasons [7,58]. The monsoon enters Nepal from the east and advances toward the west, covering the whole country within a week. Generally, the summer monsoon is characterized by widespread rainfall [7,59], whereas the pre-monsoon season is characterized by localized afternoon thundershowers [7]. The post-monsoon and winter seasons are relatively dry. Winter rainfall is more pronounced in the western part of the country and contributes to snowfall in the high Himalayan regions [60,61].

Manual Rain Gauge Data
Daily gauge-observed precipitation data from 333 manual gauging stations for 2014 to 2019 were obtained from the Department of Hydrology and Meteorology (DHM), Nepal (www.dhm.gov.np (accessed on 5 August 2020)). DHM uses the US standard 8-inch diameter rain gauge throughout the country to maintain instrumental consistency in precipitation measurement. These manual stations report the past 24-h precipitation accumulation at 0300 UTC (end of the day (EOD)) of the observation day [62]. Currently, nearly 481 gauging stations (manual and automatic) are operated by the DHM across the country; however, not all stations provide regular observations, especially in high-elevation mountainous regions. In addition to the data gap in the high mountainous region, some stations located in the southern lowlands also suffer from frequent missing and poor-quality data. Although DHM applies different quality control (QC) measures to its raw manual data, the RClimDex toolkit [63] was used to conduct further QC of the collected data, testing for outliers and missing values. Stations with more than three days of missing data in a month and more than 15 days of missing data in a year were excluded from the analysis [64]. A total of 54 of the 333 stations were excluded because of missing data, and the remaining 279 stations were used in this study (Figure 1a).
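The missing-data screening described above can be sketched as follows. This is only an illustration, not the RClimDex procedure: the function name `station_passes_qc` is hypothetical, missing days are assumed to be stored as `None`, and the two limits are assumed to apply independently (a station is dropped if either one is exceeded in any month or year).

```python
from collections import Counter
from datetime import date, timedelta


def station_passes_qc(daily, max_missing_month=3, max_missing_year=15):
    """Return True if a station satisfies the missing-data limits.

    daily: dict mapping datetime.date -> precipitation in mm, or None if missing.
    Assumption: every calendar day of the record is present as a key.
    """
    missing_per_month = Counter()
    missing_per_year = Counter()
    for day, value in daily.items():
        if value is None:
            missing_per_month[(day.year, day.month)] += 1
            missing_per_year[day.year] += 1
    month_ok = all(c <= max_missing_month for c in missing_per_month.values())
    year_ok = all(c <= max_missing_year for c in missing_per_year.values())
    return month_ok and year_ok


# Usage: one year of data with four missing days in July fails the monthly limit.
record = {date(2015, 1, 1) + timedelta(days=i): 1.0 for i in range(365)}
record.update({date(2015, 7, d): None for d in range(1, 5)})
print(station_passes_qc(record))
```
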

IMERG Product
Among the various GPM products available, the IMERG dataset is used in this study. IMERG is generated by inter-calibrating, merging, and interpolating precipitation estimates from all passive microwave sensors in the GPM constellation satellites [28]. Based on release time, IMERG has three products: early run (~4-h latency), late run (~12-h latency), and final run (2.5 months latency). Among them, the final run uses monthly gauge data to create a research-level product, whereas the other two are targeted at short-fuse applications ranging from flood monitoring to daily-scale crop forecasting, including water management [65]. IMERG precipitation estimates are computed by the Goddard Profiling algorithm (GPROF-2017) using measurements from passive microwave sensors, then calibrated against the GPM microwave-radar estimates and merged into half-hourly estimates. These estimates are recalibrated with the CMORPH Kalman Filter (CMORPH-KF) and the PERSIANN Cloud Classification System, after which bias correction with monthly GPCP rain gauge measurements is performed to produce the final IMERG precipitation product [28]. Version 6 of the product is available from June 2000 to the present at 0.1° × 0.1° resolution. Sharma et al. [48] suggested that the precipitationCal dataset embedded in IMERG performs better over Nepal than precipitationUncal; thus, IMERG-V06 precipitationCal (hereafter, IMERG) from 13 March 2014 to 31 December 2019 is considered in this study.

GSMaP Product
GSMaP is a JAXA project, initiated by the Japan Science and Technology Agency under the Core Research for Evolutional Science and Technology (CREST) program in 2002 [66-68]. When the GPM mission was launched in 2014, GSMaP updated its algorithm to version V6, and in 2017 updated it again to version V7. These GPM-era GSMaP products include "standard", "near-real-time", "real-time", and "reanalysis" products at 0.1° × 0.1° grid resolution with hourly temporal resolution [68]. In this version of the algorithm, an updated orographic and non-orographic rainfall classification scheme has been implemented for all the sensors to overcome overestimation and false alarms in heavy orographic precipitation [69]. The standard version of GSMaP, processed with Japan Meteorological Agency (JMA) Global Analysis (GANAL) data, comprises two products: GSMaP-MVK (satellite-only) and GSMaP-Gauge (gauge-corrected), both updated at an hourly interval. Global precipitation rates retrieved from brightness temperatures (BTs) of multiple passive microwave sensors (PMWs) are propagated through a forward and backward morphing technique based on atmospheric moving vectors derived from two IR images, and are then refined by a new Kalman filter model to produce the GSMaP-MVK products [48,68,70]. In addition, GSMaP-MVK precipitation estimates over land are corrected using the NOAA/Climate Prediction Center (CPC) unified gauge-based daily precipitation analysis at 0.5° grid resolution globally [68]. For this study, GSMaP-Gauge is used for further analysis, as gauge-calibrated datasets are able to represent the seasonal dynamic range of precipitation, and GSMaP can capture the spatial pattern of rainfall over Nepal [48]. The overlapping time period from 13 March 2014 to 31 December 2019 is considered for the GSMaP dataset. Table 1 summarizes the datasets used in this study.
To evaluate the performance of the SREs with reference to gauge observations over the complex terrain of Nepal, the common time period (i.e., 13 March 2014 to 31 December 2019) was considered. Of the two gridded SRE products, IMERG has a temporal resolution of 30 min and GSMaP-Gauge is available at an hourly interval, whereas the gauge observations are only available on a daily time scale. Thus, to maintain temporal consistency between datasets, both SREs were aggregated to a daily scale such that they overlap with the 24-h accumulation period (0300-0300 UTC) of the gauge-observed data, which ends at 0300 UTC (0845 NPT) on the day of measurement [62]. The aggregated gridded SRE data were then extracted at the station locations using the station-to-grid method [23,48] to avoid the additional error introduced by interpolating gauge-observed data to the SRE grid resolution [48,71-73]. Additionally, if a station has missing data on a given day, the corresponding SRE value for that day is also replaced by the missing value [48] to maintain consistency over the time series. Finally, mean annual precipitation from SREs and gauge observations was evaluated based on visual inspection [74]. Mean monthly precipitation was evaluated using a time-series plot; seasonal precipitation was assessed using the Taylor diagram, which summarizes how well the SREs match the gauge observations; and statistical metrics were computed from daily data at each station to evaluate the spatial performance. The analytical workflow followed in this study is presented in Figure 2.
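The temporal matching step can be illustrated with a minimal sketch. The function names are hypothetical, and two things are assumed rather than taken from the text: each sub-daily SRE value is a rain rate in mm/h labelled by the UTC start time of its step, and missing gauge days are stored as `None`.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta


def aggregate_to_gauge_days(steps, step_hours, eod_hour=3):
    """Sum sub-daily rain rates into 0300-0300 UTC totals.

    steps: iterable of (start_time_utc, rate_mm_per_hour) pairs.
    step_hours: step length in hours (0.5 for half-hourly, 1.0 for hourly data).
    Each total is keyed by the observation date on which the window ends.
    """
    totals = defaultdict(float)
    for start, rate in steps:
        # Shift so the window [D-1 03:00, D 03:00) maps to observation date D.
        obs_date = (start - timedelta(hours=eod_hour)).date() + timedelta(days=1)
        totals[obs_date] += rate * step_hours
    return dict(totals)


def mask_to_gauge(sre_daily, gauge_daily):
    """Set the SRE value to None on every day the gauge value is missing."""
    return {
        day: (sre_daily.get(day) if gauge is not None else None)
        for day, gauge in gauge_daily.items()
    }


# Usage: 48 half-hourly steps at 2 mm/h starting 1 July 0300 UTC fall on 2 July.
steps = [(datetime(2015, 7, 1, 3) + timedelta(minutes=30 * i), 2.0) for i in range(48)]
print(aggregate_to_gauge_days(steps, step_hours=0.5))
```

Keying each total by the date on which the window ends mirrors how the gauge reading at 0300 UTC is assigned to the observation day.
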

Statistical Evaluation Metrics
Following previous studies [23,32,43,51,75], several pairwise statistical metrics were selected to quantify the similarities and discrepancies between gauge observations and SREs (Table 2). Among them, Pearson's correlation coefficient (CC) measures the degree of linear agreement between SRE and gauge-observed precipitation, and relative bias (RB) assesses the systematic bias in the SRE. Root mean square error (RMSE) measures the average error magnitude in the SRE, and Kling-Gupta efficiency (KGE) [76] assesses the overall goodness-of-fit between SRE and gauge-observed precipitation.
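As an illustration of these four metrics, the sketch below uses their commonly published forms; it is not taken from Table 2. The function name is hypothetical, RB is assumed to be expressed in percent, and KGE is assumed to follow the original formulation of [76] built from the correlation, the ratio of standard deviations, and the ratio of means.

```python
import math


def continuous_metrics(sat, gauge):
    """Return CC, RB (%), RMSE, and KGE for paired daily series of equal length.

    Assumes both series have non-zero variance and the gauge mean is non-zero.
    """
    n = len(gauge)
    mu_s = sum(sat) / n
    mu_g = sum(gauge) / n
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sat) / n)
    sd_g = math.sqrt(sum((g - mu_g) ** 2 for g in gauge) / n)
    cov = sum((s - mu_s) * (g - mu_g) for s, g in zip(sat, gauge)) / n
    cc = cov / (sd_s * sd_g)
    rb = 100.0 * sum(s - g for s, g in zip(sat, gauge)) / sum(gauge)
    rmse = math.sqrt(sum((s - g) ** 2 for s, g in zip(sat, gauge)) / n)
    kge = 1.0 - math.sqrt(
        (cc - 1.0) ** 2 + (sd_s / sd_g - 1.0) ** 2 + (mu_s / mu_g - 1.0) ** 2
    )
    return {"CC": cc, "RB": rb, "RMSE": rmse, "KGE": kge}


# Usage: a satellite series that is exactly half the gauge series has a perfect
# correlation but a relative bias of -50% and a reduced KGE.
print(continuous_metrics([0.0, 1.0, 5.0, 2.0], [0.0, 2.0, 10.0, 4.0]))
```

The example shows why CC alone is not sufficient: a product can track the timing of rainfall perfectly while still underestimating its amount.
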
Four categorical indices, the probability of detection (POD), false alarm ratio (FAR), critical success index (CSI), and categorical frequency bias index (FBI), were considered to evaluate the detection capability of the SREs. Seven threshold values (0.5, 1, 2, 4, 6, 8, and 10 mm/d) were selected to assess SRE performance at different precipitation rates [23]. These skill indices are calculated from a 2 × 2 contingency table (Table 3) of rain/no-rain events. In Table 3, Q1 represents correctly estimated no-rain events, Q2 represents false alarms (rain estimated by the SRE where the gauge observed none), Q3 represents misses (no rain estimated by the SRE where the gauge observed rain), and Q4 refers to rain events correctly estimated by the SRE. POD, also known as hit rate, measures the fraction of gauge-observed events that were correctly diagnosed; FAR gives the fraction of diagnosed events that did not occur; CSI measures how well the SRE events correspond to gauge-observed events; and FBI compares the frequency of SRE events to that of gauge-observed events.
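The contingency counts and skill indices can be sketched as below, using the standard definitions of these scores in terms of hits, misses, and false alarms. The function name is hypothetical, and treating a day as a rain event when precipitation is greater than or equal to the threshold is an assumption.

```python
def contingency_scores(sat, gauge, threshold):
    """Return Q1-Q4 and POD, FAR, CSI, FBI for one rain/no-rain threshold (mm/d)."""
    q1 = q2 = q3 = q4 = 0
    for s, g in zip(sat, gauge):
        sat_rain = s >= threshold
        gauge_rain = g >= threshold
        if sat_rain and gauge_rain:
            q4 += 1  # hit
        elif sat_rain:
            q2 += 1  # false alarm
        elif gauge_rain:
            q3 += 1  # miss
        else:
            q1 += 1  # correct no-rain
    nan = float("nan")
    return {
        "Q1": q1, "Q2": q2, "Q3": q3, "Q4": q4,
        "POD": q4 / (q4 + q3) if (q4 + q3) else nan,
        "FAR": q2 / (q4 + q2) if (q4 + q2) else nan,
        "CSI": q4 / (q4 + q2 + q3) if (q4 + q2 + q3) else nan,
        "FBI": (q4 + q2) / (q4 + q3) if (q4 + q3) else nan,
    }


# Usage: evaluate the same pair of series at each of the seven thresholds.
gauge = [0.0, 1.0, 5.0, 0.0, 12.0, 3.0]
sat = [0.0, 2.0, 0.0, 4.0, 9.0, 0.2]
for thr in (0.5, 1, 2, 4, 6, 8, 10):
    print(thr, contingency_scores(sat, gauge, thr))
```

An FBI below 1 indicates that the SRE reports fewer events than the gauge at that threshold, and a value above 1 indicates more.
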

Table 2. Statistical indices used in this study (columns: Statistical Index, Equations, Perfect Value), beginning with the Pearson correlation coefficient.
In Table 2, n = number of samples; S = satellite-based precipitation (IMERG, GSMaP-Gauge); G = gauge observation; r = correlation coefficient; α = standard deviation; µ = mean; Q2 = false alarms; Q3 = misses; and Q4 = hits.

The ETCCDI [7,23,64,77] has proposed eleven indices for precipitation. These indices are widely used to detect precipitation extremes in the context of global warming [7,23] and are classified into absolute, threshold, duration, and percentile indices based on the computation method [7,77,78]. In our study, we considered the indices that are useful indicators of floods and droughts. For instance, among the selected indices, RX1day and RX5day characterize the magnitude of extreme precipitation events that can trigger landslides and flash floods. Similarly, R10mm and R25mm provide the frequency of heavy to very heavy precipitation events. Additionally, CDD and CWD indirectly assess drought conditions, which are vital for the agricultural sector [7]. Further, a comparison of these indices between the gauge observations and SREs was performed (Table 4).
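The six selected indices can be computed from a daily series as in the sketch below, which follows the usual ETCCDI definitions. The function names are hypothetical, the 1 mm wet-day threshold for CDD and CWD is the ETCCDI convention rather than something stated in this section, and the input is assumed to be one complete period (for example, one year) with no missing days.

```python
def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def extreme_indices(daily, wet_day=1.0):
    """Return RX1day, RX5day, R10mm, R25mm, CDD, and CWD for a daily series (mm)."""
    if len(daily) < 5:
        raise ValueError("at least five days are needed for RX5day")
    return {
        "RX1day": max(daily),                                   # wettest single day
        "RX5day": max(sum(daily[i:i + 5]) for i in range(len(daily) - 4)),
        "R10mm": sum(1 for p in daily if p >= 10.0),            # heavy days
        "R25mm": sum(1 for p in daily if p >= 25.0),            # extreme days
        "CDD": longest_run(p < wet_day for p in daily),         # longest dry spell
        "CWD": longest_run(p >= wet_day for p in daily),        # longest wet spell
    }


# Usage on a short synthetic series.
series = [0, 0, 12, 30, 5, 0.5, 0, 0, 0, 26, 11, 2, 0, 0]
print(extreme_indices(series))
```

Computing the same indices from the gauge series and from the SRE series extracted at the station gives the paired values that are then compared.
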

Spatial Distribution of Mean Annual Precipitation in Nepal
The spatial distribution of mean annual precipitation computed from gauge observations and SREs in Nepal between 2014 and 2019 is presented in Figure 3. The highest precipitation area over Nepal is located between 28.3°N-28.5°N and 83.8°E-84°E (around Lumle on the windward side of Mt. Annapurna, Figure 1a) with >10 mm/d, and the lowest mean annual precipitation (<2 mm/d) is located on the northern (leeward) side of Mt. Annapurna (Figure 3a). This precipitation gradient is mainly due to the Annapurna mountain range in between, which acts as a barrier to further northward moisture advection. The orographic response of the mountains plays a dominant role in creating such a precipitation gradient [79]. Both SREs reflect the heterogeneous precipitation characteristics over Nepal well (Figure 3b,c), exhibiting remarkable spatial consistency in precipitation patterns; however, there is a marked variation in the amount and in the areas of maximum precipitation. GSMaP-Gauge shows a better ability to quantitatively detect such high precipitation pockets (>10 mm/d) (Figure 3c), whereas IMERG completely misses the heavy precipitation region (Figure 3b). Since GSMaP-Gauge applies an orographic/non-orographic precipitation classification scheme [69], it outperforms IMERG in such areas of orographic precipitation. Although the orographic/non-orographic precipitation classification scheme is not applied in the IMERG algorithm [80], IMERG shows heavy precipitation around 27.7°N, 84.5°E (Figure 3b), which is not present in GSMaP-Gauge (Figure 3c). Moreover, both SREs captured the north-south precipitation variability, with relatively low precipitation (<2 mm/d) on the leeward sides along the high mountain range and high precipitation (>10 mm/d) on the windward side around the southern part of the country. Overall, both SREs underestimated the gauge-observed mean annual precipitation during the study period.

Monthly Precipitation Distribution
The mean monthly precipitation cycle and time-series plots during the study period are presented in Figure 4. As GSMaP-Gauge is available only from March 2014, the monthly cycle is presented only from 2015 to 2019. The months from June to September (monsoon season) received high precipitation with a peak in July (15.53 mm/d), whereas October to November is the driest season, with November being the driest month (0.073 mm/d) (Figure 4a). IMERG and GSMaP-Gauge closely follow the monthly precipitation cycle of the gauge observations with a peak in July; however, both underestimated the precipitation, especially during the monsoon months (July-August) (Figure 4a). It is worth noting that the underestimation during July and August is larger in GSMaP-Gauge than in IMERG; however, both products capture the observed monthly precipitation cycle.
Atmosphere 2021, 12, x FOR PEER REVIEW

Furthermore, 2016 (2015) is observed as the wettest (driest) year of the study period. Both SREs captured the monthly precipitation variability well, with larger underestimation during the wet months. A distinctive feature of the monthly mean precipitation during the 2016 and 2019 monsoons is a sudden drop from the July peak to August followed by a rise in September; both SREs captured a similar pattern with slight underestimation (Figure 4b).

The Taylor diagram in Figure 5 shows the CC, centered RMSE, and standard deviation (SD) between gauge observations and SREs at daily and seasonal timescales. Mean seasonal precipitation was computed from spatially averaged daily precipitation for both gauge observations and SREs; then SD (blue dotted arc), CC (black dotted line from the origin), and centered RMSE (continuous blue arc) were calculated and plotted in the Taylor diagram. The centered RMSE ranges from 0 to 1 and has a similar statistical meaning to RMSE, and the performance of an SRE is considered best if it lies closest to the black dot representing the gauge observations. IMERG outperformed GSMaP-Gauge at both daily and seasonal scales. At the daily timescale, IMERG shows a higher correlation (>0.9) and lower RMSE (<0.5). Similarly, at the seasonal scale, IMERG achieves greater CC with the gauge observations (DJF = 0.95, JJAS = 0.88, ON = 0.92) than GSMaP-Gauge (DJF = 0.92, MAM = 0.84, ON = 0.90). During the wet season, IMERG showed a lower RMSE than GSMaP-Gauge (RMSE > 0.5); however, in the drier season (ON), GSMaP-Gauge shows a smaller RMSE (<0.5) than IMERG (0.6).
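For reference, the three statistics on a Taylor diagram are tied together by a law-of-cosines identity, which is why a single point can encode all of them. The notation below is generic rather than the paper's: σ_S and σ_G denote the standard deviations of the SRE and gauge series, and E' the centered RMSE. The second form, normalized by σ_G, is an assumption suggested by the 0 to 1 range quoted above.

```latex
E'^2 = \sigma_S^2 + \sigma_G^2 - 2\,\sigma_S\,\sigma_G\,\mathrm{CC},
\qquad
\hat{E}'^2 = 1 + \hat{\sigma}_S^2 - 2\,\hat{\sigma}_S\,\mathrm{CC},
\quad
\hat{\sigma}_S = \frac{\sigma_S}{\sigma_G},\;
\hat{E}' = \frac{E'}{\sigma_G}.
```

A product with CC close to 1 and a standard deviation close to that of the gauges therefore plots near the reference point, where the centered RMSE approaches zero.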

Spatial Distribution of Statistical Scores
The spatial distribution of statistical scores based on the daily timescale at each station is presented in Figure 6. Both products show a higher CC in the southern and middle hilly regions and a low CC over the northern high mountainous region. The low CC over the northern high mountain areas can be attributed to the limitations of SREs in capturing precipitation over mountainous regions [81]. The performance of IMERG and GSMaP-Gauge differs over the high precipitation zone, i.e., central Nepal (similar to Section 3.1.1). Further, GSMaP-Gauge outperformed IMERG in terms of CC (Figure 6a); however, both underestimated the gauge-observed precipitation, as indicated by the negative bias in Figure 6c,d. Interestingly, both products showed very similar RMSE distributions at most of the stations during the study period. Of the two SREs, IMERG shows a higher KGE value (>0.5) at most of the stations in southern Nepal, indicating IMERG's better performance in the southern flat region (Figure 6g). Furthermore, on the complex northern side, IMERG performed better, with fewer stations having negative KGE than GSMaP-Gauge (Figure 6g,h). Interestingly, on the leeward side of Mt. Annapurna, GSMaP-Gauge performed relatively well, with a KGE value >0, compared to IMERG.
In addition, SRE performance against gauge observations at the daily scale was assessed based on POD, FAR, CSI, and FBI calculated at different thresholds ranging from 0.5 mm/d to 10 mm/d. Both SREs can capture light precipitation; the declining curve with increasing threshold indicates a decline in SRE performance with increasing precipitation amount (Figure 7). In terms of POD and CSI at high rainfall intensity, IMERG performed better than GSMaP-Gauge. Similarly, in terms of FBI, IMERG outperformed GSMaP-Gauge with slightly smaller underestimation for heavy rainfall; however, for rainfall less than ~3 mm/d, GSMaP is better. Overall, IMERG is more capable of detecting heavy precipitation than GSMaP-Gauge.

Spatial Distribution of Extreme Precipitation Indices
Precipitation extremes are a major trigger of meteorological disasters worldwide, from flash floods to meteorological droughts. Identifying and monitoring such extremes plays a crucial role in limiting economic losses and human casualties. Figure 8 presents the CC between six extreme precipitation indices from SREs and gauge observations. IMERG demonstrated better capacity to capture daily extremes, with higher CC (0.37 and 0.53) for the absolute indices (RX1day and RX5day), whereas GSMaP-Gauge performed better on the total count of heavy and extreme precipitation events (R10mm and R25mm) and on the detection of consecutive dry and wet days (CDD and CWD). Overall, both SREs captured extreme precipitation events better than consecutive precipitation days, which is similar to the finding of [23].
Furthermore, the spatial distribution of CDD and CWD spells in Nepal is presented in Figure 9. A higher number of CDD spells is observed in the southern lower flat region of eastern, central, and some areas of the western region than in northern areas (Figure 9a). The IMERG product underestimated the total frequency of CDD spells, with a higher negative bias (−20.7%) and RMSE (124.43) (Figure 9b). In contrast, GSMaP-Gauge overestimated the total frequency of CDD spells. Similar to mean precipitation distribution in Figure 3a, the peak precipitation region has higher total CWD spells (>100) while lower CWD

Spatial Distribution of Extreme Precipitation Indices
Precipitation extremes are the major triggering factor for meteorological disasters, including flash floods to meteorological drought globally. Identification of such extremes and monitoring plays a crucial role in controlling economic and human casualties. Figure 8 presents the CC between six extreme precipitation indices from SREs and gauge observations. IMERG demonstrated better detection capacity on capturing daily extremes with higher CC (0.37 and 0.53) for absolute indices (RX1day and RX5day), whereas GSMaP-Gauge on the other hand shows better performance on the total count of heavy and extreme precipitation events (R10mm and R25mm), including consecutive dry and wet days (CDD and CWD) detection. Overall, both SREs well captured the extreme precipitation event than consecutive precipitation days, which is also similar to the study of [23].

Spatial Distribution of Extreme Precipitation Indices
Precipitation extremes are a major trigger of meteorological disasters worldwide, ranging from flash floods to meteorological drought. Identifying and monitoring such extremes plays a crucial role in limiting economic losses and human casualties. Figure 8 presents the CC between six extreme precipitation indices from the SREs and gauge observations. IMERG captured daily extremes better, with higher CC (0.37 and 0.53) for the absolute indices (RX1day and RX5day), whereas GSMaP-Gauge performed better for the total counts of heavy and extreme precipitation events (R10mm and R25mm) and for consecutive dry and wet days (CDD and CWD). Overall, both SREs captured extreme precipitation events better than consecutive precipitation days, which is consistent with the study of [23].
Furthermore, the spatial distribution of CDD and CWD spells in Nepal is presented in Figure 9. More CDD spells are observed in the southern lowland plains of the eastern and central regions, and in some areas of the western region, than in northern areas (Figure 9a). The IMERG product underestimated the total frequency of CDD spells, with a higher negative bias (−20.7%) and RMSE (124.43) (Figure 9b). In contrast, GSMaP-Gauge overestimated the total frequency of CDD spells. Similar to the mean precipitation distribution in Figure 3a, total CWD spells are highest (>100) in the peak precipitation region, lower (<100) in mid-elevation areas, and lowest (<50) over the southern plains of the country. Moreover, GSMaP-Gauge reproduces the spatial distribution better (higher correlation) than the IMERG product; however, the IMERG product achieves a smaller bias than GSMaP-Gauge for CWD spells.
The SREs' ability to capture heavy (R10mm) and extreme (R25mm) precipitation events at each station during the study period is shown in Figure 10. R10mm and R25mm are the counts of rainy days above 10 mm/day and 25 mm/day, respectively. Few R10mm and R25mm events (<100) are observed at stations located in the northern high-elevation and southern flat areas. The higher frequencies of both indices (>150 and >300 events) are concentrated along the middle-elevation areas (Figure 10a,d). IMERG and GSMaP-Gauge underestimated the frequency of R25mm, with negative biases of −31.9% and −41.4%, respectively. This underestimation was more prominent over the heavy precipitation area of the central region. Similarly, both SREs tend to underestimate the total frequency of R10mm; however, GSMaP-Gauge captured the spatial distribution of R10mm over the heavy precipitation area of central Nepal better than the IMERG product. Overall, GSMaP-Gauge shows better performance in capturing the total frequency of R25mm and R10mm in central Nepal. Moreover, GSMaP-Gauge achieves a higher CC, indicating better spatial agreement of heavy precipitation events with gauge observations than the IMERG product.
Two absolute indices (RX1day and RX5day) were also computed to evaluate the performance of IMERG and GSMaP-Gauge in capturing maximum one-day and consecutive five-day precipitation (Figure 11). RX1day and RX5day are the annual maximum precipitation received in a single day and in five consecutive days, respectively. Compared with the distribution of heavy precipitation in Nepal, extreme precipitation is more concentrated in the southern flat areas (Figure 11a,d) than in the middle- and high-elevation areas. The highest single-day extreme precipitation (>500 mm) was observed over central Nepal, and the highest five-day precipitation (>1050 mm) was observed over western Nepal (Figure 11a,b). Although both SREs underestimated the magnitude of RX1day during the study period, IMERG shows a better ability to detect single-day extreme precipitation events than GSMaP-Gauge (Figure 11b,c). Furthermore, the RX1day underestimation in GSMaP-Gauge was larger for stations located in the eastern and central regions. The spatial distribution of RX5day in both SREs is very similar to that of RX1day. Moreover, IMERG captures both RX1day and RX5day better than GSMaP-Gauge, with smaller bias and RMSE values (Figure 11).
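The six indices discussed in this section can be sketched from a daily precipitation series as follows. This is a minimal illustration that assumes ETCCDI-style conventions: a wet day is one with at least 1 mm, the count indices use "at or above" the threshold, and CDD and CWD are taken as the longest dry and wet runs. The spell totals reported above may follow a different counting rule, and all function names and sample values are hypothetical.

```python
def count_days_at_or_above(daily, threshold):
    """Number of days with precipitation >= threshold (mm/day)."""
    return sum(1 for p in daily if p >= threshold)


def rx_nday(daily, n):
    """Maximum precipitation total over any n consecutive days."""
    if len(daily) < n:
        return float("nan")
    window = sum(daily[:n])
    best = window
    for i in range(n, len(daily)):
        # Slide the window forward by one day.
        window += daily[i] - daily[i - n]
        best = max(best, window)
    return best


def longest_run(daily, is_member):
    """Length of the longest consecutive run of days satisfying is_member."""
    best = current = 0
    for p in daily:
        current = current + 1 if is_member(p) else 0
        best = max(best, current)
    return best


def extreme_indices(daily, wet_day=1.0):
    """Six indices for one series; wet_day = 1 mm is an assumed threshold."""
    return {
        "R10mm": count_days_at_or_above(daily, 10.0),
        "R25mm": count_days_at_or_above(daily, 25.0),
        "RX1day": rx_nday(daily, 1),
        "RX5day": rx_nday(daily, 5),
        "CDD": longest_run(daily, lambda p: p < wet_day),
        "CWD": longest_run(daily, lambda p: p >= wet_day),
    }


# Hypothetical 12-day series (mm/day).
series = [0.0, 0.0, 12.0, 30.0, 4.0, 0.5, 0.0, 0.0, 0.0, 26.0, 11.0, 2.0]
print(extreme_indices(series))
```

Applying the same function to the gauge series and to each SRE series at a station gives index pairs that can then be compared using CC, bias, and RMSE.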

Discussion
Nepal is characterized by complex topography and inherent north-south and east-west heterogeneity in the precipitation distribution. A sparse station network, with only a few stations in the higher mountainous region, limits our understanding of the precipitation pattern and its distribution. SREs therefore play a crucial role in providing this information, especially in data-scarce regions. Although SREs offer a higher spatial and temporal resolution, several factors create uncertainties in these products, such as the algorithm used to estimate precipitation from reflectivity, the onboard sensors, and the representativeness and spatial distribution of the ground stations used to calibrate the product [18]. SREs have major limitations in estimating precipitation correctly over complex topography, because the interaction of weather systems with orography at the local scale often drives the spatial variability of precipitation. This study evaluates two GPM-era SREs (IMERG and GSMaP-Gauge) in Nepal using 279 gauge stations. The evaluation is based on multiple statistical metrics and on the SREs' ability to capture precipitation extremes, assessed using six extreme precipitation indices.
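As an illustration of the continuous statistical metrics referred to in this study (CC, bias, and RMSE), the following minimal sketch compares a paired gauge and SRE series. The formulas are the commonly used definitions and are assumed here rather than taken from the study's methods; the relative bias in percent, the function name, and the sample values are likewise illustrative.

```python
import math


def continuous_metrics(gauge, sre):
    """CC, mean bias, relative bias (%) and RMSE for paired series.

    Bias is defined as SRE minus gauge, so negative values mean the
    SRE underestimates the gauge observation (assumed sign convention).
    """
    n = len(gauge)
    mean_g = sum(gauge) / n
    mean_s = sum(sre) / n
    cov = sum((g - mean_g) * (s - mean_s) for g, s in zip(gauge, sre))
    var_g = sum((g - mean_g) ** 2 for g in gauge)
    var_s = sum((s - mean_s) ** 2 for s in sre)
    return {
        "CC": cov / math.sqrt(var_g * var_s),
        "bias": mean_s - mean_g,
        "rbias_percent": 100.0 * (sum(sre) - sum(gauge)) / sum(gauge),
        "RMSE": math.sqrt(sum((s - g) ** 2 for g, s in zip(gauge, sre)) / n),
    }


# Hypothetical daily totals (mm/day) at one station.
gauge_mm = [0.0, 4.0, 10.0, 22.0]
sre_mm = [1.0, 3.0, 7.0, 15.0]
print(continuous_metrics(gauge_mm, sre_mm))
```

In this example the SRE tracks the gauge closely in time (high CC) while still underestimating the amount (negative bias), which is the kind of distinction the comparison between correlation and bias is meant to expose.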
As shown in the Results section, both SREs can capture the temporal and spatial pattern of precipitation on a broad scale; however, their performance differs in several respects. A striking weakness of both SREs is a severe underestimation of precipitation over the central region of Nepal, where maximum precipitation is observed; in particular, the SREs failed to locate the well-known high-precipitation area in central Nepal (Lumle area). Similar findings were also reported in recent studies [18,32,48]. Shrestha et al. [82] also found a similar performance of the GSMaP product in Nepal. The gauge-corrected GSMaP product showed a better capability than IMERG to capture spatial variability in complex terrain, where orography plays a dominant role. The latest version of GSMaP-Gauge (V07) applies an updated orographic/non-orographic precipitation classification scheme in its production [69], which may explain the better performance of GSMaP-Gauge relative to the IMERG product over the mountainous areas.
IMERG is a merged precipitation product from IR and PMW sensors, while GSMaP-Gauge comprises data from multiple PMW sensors [37]. It is worth mentioning that, in warm orographic precipitation, cloud tops can be too warm for IR thresholds and the clouds contain too little ice aloft to be detected by the PMW sensors [36]. The inability of IR sensors to resolve multi-layered raining clouds during JJAS could be another reason for such underestimation [34]. Although GSMaP-Gauge captured the spatial variability well, IMERG performed better than GSMaP-Gauge in estimating precipitation amount on the seasonal and daily time scales. In terms of categorical skill scores (POD, FAR, CSI, and FBI), a substantial performance difference was observed between the two SREs, especially in capturing the precipitation magnitude. Light precipitation usually occurs in the high-elevation areas of the country, where the orography-corrected GSMaP-Gauge outperforms the IMERG product in detecting light precipitation intensity (0.5 mm/d) during the study period. In contrast, the IMERG product showed better skill than GSMaP-Gauge in detecting higher precipitation intensity (>2 mm/d). Similarly, previous studies also showed that GSMaP-Gauge performs more weakly for heavy precipitation than for light precipitation [18,23,48,83]. Furthermore, a study conducted in the coastal mountains of tropical regions revealed that retrieval algorithms that associate heavy precipitation only with high echo-top heights may limit the detection capability of SREs over mountainous regions, where orographic precipitation has relatively low echo-top heights [84].
Evaluation of six extreme precipitation indices shows that IMERG is better than GSMaP-Gauge at capturing daily extremes (i.e., RX1day and RX5day); however, both underestimated the magnitude. Similarly, Navarro et al. [85] found that the IMERG product is better at detecting precipitation maxima in the complex terrain of the Pyrenees in the Iberian Peninsula. Extreme daily precipitation is more pronounced over Nepal's southern flat region, and the relatively better performance of IMERG over such flat regions is also evident in a previous study [86]. Although IMERG shows a good ability to capture daily extremes, GSMaP-Gauge showed a better capability to capture the duration-based extreme indices (i.e., CDD and CWD). Moreover, the IMERG product overestimated the CDD and CWD spells and underestimated R10mm and R25mm, whereas GSMaP-Gauge overestimated the total frequency of CWD spells. Such positive and negative biases in both SREs might be related to the bias correction schemes, which smooth the precipitation time series by detecting non-rainy days as rain events and hence reduce the amplitude of extreme precipitation [87]. Furthermore, Xiao et al. [23] concluded that SREs are better able to capture single-day extremes than continuous days of rainfall, which may explain the under- and overestimation of CDD and CWD spells.

Conclusions
This study assesses the spatiotemporal performance of two GPM-era SREs (IMERG and GSMaP-Gauge) with reference to 279 gauge stations in Nepal during 2014-2019. Additionally, the performance of both SREs in capturing extreme precipitation events was evaluated over the study period.
Both SREs (IMERG and GSMaP-Gauge) captured the temporal variation of observed precipitation well; however, the negative bias in the IMERG product (−2.49 mm/day) was slightly smaller than that of GSMaP-Gauge. In contrast, GSMaP-Gauge was more consistent than IMERG in capturing the spatial variability of gauge-observed precipitation over Nepal.
Categorical skill scores computed at multiple rain/no-rain thresholds in the daily data show that both SREs can detect light precipitation; however, their performance declines with increasing precipitation intensity. Furthermore, the IMERG product consistently outperformed GSMaP-Gauge at higher precipitation intensities.
For extreme precipitation events, the IMERG product showed better skill than GSMaP-Gauge in capturing daily extremes, with a smaller negative bias, whereas GSMaP-Gauge outperformed the IMERG product and showed consistent performance for the duration- and threshold-based indices (CDD, CWD, R10mm, R25mm).
Overall, the IMERG product estimated precipitation amounts more consistently with gauge observations over Nepal than GSMaP-Gauge, suggesting that IMERG can be a good alternative precipitation product for evaluating extreme events in data-gap regions. However, there is still room for improvement in the rainfall retrieval algorithms of SREs in mountainous regions, where topography greatly influences precipitation distribution. Therefore, rigorous evaluation that considers topography and precipitation type, together with region-specific robust bias correction, is required before any application of SREs. For algorithm developers and data users, this study further recommends continuous evaluation of these SREs, focusing on major drawbacks such as the undercatch of warm orographic precipitation.

Data Availability Statement: The GPM Level 3 Integrated Multi-Satellite Retrievals for Global Precipitation Measurement Version 6 (IMERG V6) daily data (GPM_3IMERGDF) used in this study can be freely accessed from NASA GES DISC (https://disc.gsfc.nasa.gov/datasets/GPM_3IMERGDF_06/summary?keywords=IMERG (accessed on 8 February 2021)), and the hourly GSMaP (GSMaP-Gauge) data can be freely obtained at ftp://rainmap:Niskur+1404@hokusai.eorc.jaxa.jp/standard/v7/hourly_G/ (accessed on 8 February 2021). The daily data sets for all the stations used in this study can be purchased from DHM, Government of Nepal (http://www.dhm.gov.np/ (accessed on 8 February 2021)).