Benchmarking MSWEP Precipitation Accuracy in Arid Zones Against Traditional and Satellite Measurements

Abdelrazaq, Abdulrahman Saeed; Alnuaimi, Humaid Abdulla; Baig, Faisal; Elkollaly, Mohamed; Sherif, Mohsen

doi:10.3390/rs18010095

by Abdulrahman Saeed Abdelrazaq ¹, Humaid Abdulla Alnuaimi ¹, Faisal Baig ²,*, Mohamed Elkollaly ² and Mohsen Sherif ²

¹ Department of Information Technology, College of Information Technology, UAE University, Al Ain P.O. Box 15551, United Arab Emirates

² National Water and Energy Center, UAE University, Al Ain P.O. Box 15551, United Arab Emirates

* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(1), 95; https://doi.org/10.3390/rs18010095

Submission received: 11 November 2025 / Revised: 18 December 2025 / Accepted: 22 December 2025 / Published: 26 December 2025

(This article belongs to the Section Atmospheric Remote Sensing)

Highlights

What are the main findings?

MSWEP v2.8 performed on par with IMERG and CMORPH over the UAE (2004–2020), with lower RMSE and balanced KGE values.
Seasonal and event-based analyses show MSWEP reliably captures both frontal and convective rainfall events in arid environments.

What are the implications of the main findings?

MSWEP provides dependable rainfall estimates for flood forecasting, drought monitoring, and water resource modeling in data-scarce regions.
The results support MSWEP as a benchmark product for regional calibration and climate impact assessments across the Arabian Peninsula.

Abstract

Accurate precipitation data is vital for hydrological modeling, climate research, and water resource management, especially in arid regions like the United Arab Emirates (UAE), where rainfall is sparse and highly variable. This study assesses the performance of the Multi-Source Weighted-Ensemble Precipitation v2.8 (MSWEP) dataset against ground-based gauge data and three satellite precipitation products—CMORPH, IMERG, and GSMaP—across the UAE from 2004 to 2020. Evaluation metrics include statistical, categorical, and extreme precipitation indices. MSWEP shows a moderate correlation with gauge data (mean CC = 0.62), performing better than CMORPH (0.54) but below IMERG (0.68). It also yields lower RMSE and MAE than CMORPH and GSMaP, indicating improved error metrics. However, MSWEP overestimates light rainfall and underestimates extreme events, reflected in a lower KGE (0.42) and weak performance in the 95th percentile rainfall, especially in coastal and mountainous areas. Seasonal analysis reveals overestimation in winter and underestimation during summer convective storms. While MSWEP offers strong global coverage and temporal consistency, its application in arid environments like the UAE requires bias correction. These findings highlight the need for integrating multiple datasets and regional adjustments to enhance rainfall estimation accuracy for hydrological and climate-related applications.

Keywords:

arid regions; MSWEP precipitation; rain gauge validation; satellite rainfall estimates

1. Introduction

Precipitation is one of the most influential hydrological phenomena, with significant effects on terrestrial ecosystems, socio-economic systems, and hydrological processes. Unlike other meteorological variables, precipitation manifests as intermittent events that exhibit spatial and temporal variability [1,2]. Rainfall intensity distributions govern key hydrological processes, including surface runoff generation, groundwater recharge, and river baseflow, thereby influencing water resource management and agricultural productivity. However, when rainfall intensities exceed infiltration and drainage capacities, particularly in arid and semi-arid environments, they can trigger floods and flash floods, causing severe damage to infrastructure, disrupting transportation networks, degrading agricultural land, and threatening human lives [3,4]. Quantitative precipitation estimation plays a crucial role in climate change studies and disaster mitigation efforts, particularly in guiding the safe design of water infrastructure [5]. Accurate data on rainfall distribution at high resolutions are essential for predicting dynamic surface hydrologic states, yet achieving such precision remains a challenge due to the heterogeneous nature of precipitation in space and time [1,6].

Traditionally, rain gauge networks are reliable sources of direct precipitation measurements, providing point-scale observations essential for model calibration, validation, and forecasting [7]. Despite their reliability, rain gauges are subject to errors arising from calibration issues, maintenance challenges, and wind effects, which may lead to underestimation of precipitation [8,9,10,11,12]. Common issues include snow blowing, raindrop splashing, wetting losses, and evaporative losses [13]. Additionally, rain gauges are limited by their spatial interpolation, which depends more on network geometry than on actual precipitation distribution [5]. Sparse networks, particularly in tropical rainforests and mountainous regions, exacerbate these limitations [14,15]. Rain gauges in high-altitude areas often suffer from harsh climatic conditions, resulting in seriously biased measurements [6]. While weather radar systems can provide an overview of rainfall distribution, spatially and temporally, radars are also hindered by challenges such as ground clutter and beam blockage, which could degrade data quality. Moreover, radar systems remain financially and technically inaccessible for many developing nations [16,17].

Satellite-based rainfall estimates have emerged as a promising alternative to ground-based observations, offering extensive spatial coverage and high temporal resolution [18]. Since the 1970s, satellite observations have been utilized to extract global precipitation information, becoming indispensable for advancing hydrological, atmospheric, and climate sciences [19,20]. These products are effective substitutes in regions with sparse or unreliable rain gauges [13]. In particular, satellite precipitation products (SPPs) provide valuable data for flood prediction, drought monitoring, and climate change research [21,22,23].

The Global Precipitation Measurement (GPM) mission, an international collaboration led by NASA and the Japan Aerospace Exploration Agency (JAXA), launched its Core Observatory satellite in 2014 to set a new standard in remotely sensed precipitation measurements [17,24]. The Integrated Multi-Satellite Retrievals for GPM (IMERG) dataset, a key product of this mission, offers global coverage with a high spatial resolution of 0.1° and a temporal resolution of 30 min [24,25]. IMERG provides three levels of products: near-real-time Early and Late-run products and a post-real-time Final product [21]. Unlike earlier satellite missions, such as TRMM, GPM-IMERG employs advanced sensors to detect both light and heavy rainfall as well as snowfall. Due to its high accuracy, IMERG has been widely used in hydrological and meteorological applications [8,26,27,28,29]. Another competitive satellite dataset, CMORPH, was developed by the National Oceanic and Atmospheric Administration (NOAA) Climate Prediction Center and uses infrared (IR) and low-orbit passive microwave (PMW) data to estimate rainfall [30,31]. In addition, as a part of the GPM program, the precipitation measurement team at JAXA, Japan, developed and released the GSMaP product that offers valuable precipitation estimates [32]. The MSWEP (Multi-Source Weighted-Ensemble Precipitation) dataset, developed and maintained by GloH2O (Global Hydrology and Climate Solutions), offers global precipitation data with a 3-hourly temporal resolution spanning from 1979 to the near-present [33]. MSWEP V2 represents the first fully global precipitation dataset with a spatial resolution of 0.1°, enabling hyper-resolution land-surface modeling worldwide. This dataset stands out by leveraging the complementary strengths of gauge-, satellite-, and reanalysis-based data to provide reliable precipitation estimates [34,35]. Studies in other regions, including arid regions such as Saudi Arabia, have shown that several satellite precipitation products may contain errors due to sensor-based or instrument-based discrepancies [36,37,38,39,40,41].

In the United Arab Emirates (UAE), some satellite precipitation products have been evaluated. Four datasets from the PERSIANN family were compared over ten years, from 2011 to 2020, in the UAE. PERSIANN-CDR was identified as the most reliable dataset for capturing extreme rainfall events and spatial distribution over the region [42]. Trend analyses using IMERG v6 and CHIRPS (Climate Hazards Group InfraRed Precipitation with Station data) indicated an increase in total precipitation and consecutive wet days across three selected UAE watersheds [43]. In addition, Baig et al. [8] utilized CMORPH, PERSIANN, and IMERG datasets along with rain gauge data to analyze 17-year rainfall trends over the UAE. Other studies validated the applicability of IMERG and CMORPH for understanding rainfall variability, highlighting their potential to complement or replace ground-based measurements in the UAE [26,44].

The four promising satellite-based rainfall products employed in this study, GSMaP, IMERG, CMORPH, and MSWEP, are widely utilized worldwide due to their reliability and global applicability. However, to the best of our knowledge, no previous study has utilized the MSWEP dataset in the context of the United Arab Emirates (UAE). The primary goal of this study is to test and calibrate the newly developed MSWEP dataset over the UAE, a region with highly variable rainfall and critical water resource management needs. Using daily rainfall data from 38 rain gauge stations distributed across the UAE, this study evaluates the performance of MSWEP against three well-established satellite rainfall products: CMORPH, IMERG, and GSMaP. Despite its demonstrated strengths, MSWEP is not without limitations, particularly in regions with sparse or uneven gauge coverage, such as the UAE. Because MSWEP relies partly on gauge-based bias correction to merge satellite and reanalysis inputs, its accuracy can diminish in data-scarce areas where limited ground observations constrain the calibration process. The United Arab Emirates, the focus of this study, provides a unique testing ground for satellite precipitation products due to its arid climate and complex topography. The region experiences annual rainfall averaging less than 100 mm in most areas, with highly uneven distribution across seasons and locations [36]. Rainfall events, although infrequent, are crucial for groundwater recharge and vegetation, making accurate rainfall estimation vital for sustainable water resource management and urban planning.

Despite the growing global adoption of MSWEP, its performance has been tested primarily in humid, monsoonal, or temperate contexts, with limited validation in hyper-arid environments. These regions pose unique retrieval challenges due to sparse precipitation, dominance of warm-cloud rainfall, low gauge density, and pronounced orographic influences. Consequently, a critical gap exists in understanding whether MSWEP’s multi-source merging framework is robust under hyper-arid dynamics. This study addresses this gap by providing a comprehensive validation of MSWEP over the UAE.

Although MSWEP has been evaluated at global and continental scales, its behavior in hyper-arid environments remains poorly constrained. Hyper-arid regions are characterized by strongly zero-inflated rainfall distributions, infrequent but intense convective events, rapid evaporation, and sparse gauge networks used for bias correction. These conditions introduce uncertainties in how multi-source precipitation products represent rainfall occurrence, intensity, and extremes. In particular, it is unclear how reliably MSWEP performs under extreme aridity, where precipitation is rare but hydrologically significant. Addressing this gap, the present study provides a comprehensive evaluation of MSWEP over the United Arab Emirates by benchmarking its performance against gauge observations and multiple satellite products across statistical, seasonal, and extreme-event dimensions.

This study addresses a critical research gap concerning the limited evaluation of multi-source precipitation products in hyper-arid environments, where sparse observation networks and localized convective events present substantial challenges for satellite retrieval accuracy. Building on this gap, we hypothesized that MSWEP, despite its advanced multi-source merging framework, would exhibit performance variability across different seasons and physiographic settings of the UAE, and that benchmarking it against CMORPH, IMERG, and GSMaP would reveal systematic strengths and limitations relevant to hydrological applications. By systematically assessing these four datasets using statistical, categorical, and extreme rainfall metrics, the study provides clear, application-oriented insights into their reliability for arid-region hydrology and climate analysis. The results not only contribute to the broader understanding of satellite precipitation skill in dry climates but also support the refinement of retrieval algorithms and inform future efforts in water-resource planning, hazard forecasting, and regional dataset calibration.

Study Area

The United Arab Emirates (UAE) is situated in the southeastern part of the Arabian Peninsula, bordered by Saudi Arabia and Oman (Figure 1). It is characterized by an arid climate with extreme temperatures and scarce precipitation. The country experiences highly variable and spatially heterogeneous rainfall patterns, with annual precipitation largely influenced by convective storms and orographic effects in the eastern mountainous regions. While most of the country receives minimal rainfall, localized areas, particularly in elevated regions, experience relatively higher precipitation due to topographical influences [8,36]. Precipitation monitoring in the UAE is managed by the National Center of Meteorology (NCM), which operates an extensive network of automatic rain gauge stations across the country. These stations are strategically distributed to capture precipitation variability across diverse landscapes, including coastal areas, deserts, and mountainous regions [16]. Rainfall in the UAE is typically low and unevenly distributed both spatially and temporally, with annual averages ranging between 80 mm and 120 mm depending on the location [45]. Seasonal rainfall trends are heavily influenced by the region’s unique geography, including coastal plains, mountainous terrain, and desert landscapes [46]. Rainfall is primarily concentrated during the winter and early spring months, associated with mid-latitude cyclones and frontal systems, while the summer months are largely dry. The spatial distribution of total annual average rainfall (Figure 1c) highlights a clear gradient, with higher precipitation amounts concentrated in the northeast and significantly lower values in the inland desert areas. This variability presents a challenge in hydrological modeling and water resource management, emphasizing the need for accurate and reliable precipitation datasets. 
Despite its sporadic nature, rainfall is a critical hydrological input for groundwater recharge and ecological processes in the UAE. The availability of high-resolution gauge data allowed for a robust evaluation of satellite rainfall products, providing valuable insights into their ability to capture the unique rainfall patterns of this arid environment.

2. Materials and Methods

2.1. Data Sources

This study utilized both ground-based rain gauge data and satellite precipitation products to evaluate the accuracy of the MSWEP dataset and compare its performance with CMORPH, IMERG, and GSMaP. The precipitation data from 2004 to 2020 were utilized for the gauge and the satellite products. The rain gauge data served as the ground truth, while the satellite datasets provided spatially extensive rainfall estimates, each with distinct characteristics in terms of spatial resolution, temporal resolution, and retrieval algorithms.

The daily ground-based rain gauge data were obtained from the National Center of Meteorology (NCM) in the United Arab Emirates (https://www.ncm.gov.ae). This dataset comprised daily rainfall measurements from 38 rain gauge stations distributed across the UAE. The gauges are well-maintained and calibrated to provide reliable and accurate rainfall observations. These stations span diverse geographical areas, including coastal plains, mountainous regions, and desert interiors, capturing the spatial variability of rainfall in this arid region. The dataset covered the period from 1 January 2004 to 31 December 2020, providing a robust reference for validating satellite precipitation estimates.

The MSWEP V2.8 dataset, developed by [33], is an advanced Multi-Source Weighted-Ensemble Precipitation product that integrates gauge observations, satellite data, and reanalysis estimates. MSWEP provides high-resolution precipitation estimates at 0.1° (approximately 10 km at the equator) and 3 h temporal resolution (https://www.gloh2o.org/mswep/). The dataset employs innovative weighting algorithms and bias correction techniques to harmonize data from multiple sources, enhancing accuracy and reducing inconsistencies. By leveraging the strengths of ground-based and remotely sensed measurements, MSWEP delivers reliable rainfall estimates globally, including over regions with limited in situ data coverage. While MSWEP has demonstrated utility in hydrological and climate applications, its performance in arid regions like the UAE, characterized by unique precipitation patterns, remains underexplored, emphasizing the importance of this study in validating and assessing its applicability in such environments.

The Climate Prediction Center morphing technique (CMORPH), developed by NOAA, is a widely used satellite precipitation product that generates global rainfall estimates by morphing precipitation features observed by passive microwave sensors. It operates at a spatial resolution of 0.25° and provides precipitation estimates at 30 min intervals, which can be aggregated to daily totals. CMORPH employs a motion-vector-based approach to propagate microwave-derived precipitation fields using infrared imagery, thereby ensuring high temporal resolution. However, the reliance on infrared data during microwave sensor gaps can introduce uncertainties, particularly in regions with low rainfall intensities, such as the UAE. The CMORPH-CRT daily precipitation data used in this study were downloaded from https://ftp.cpc.ncep.noaa.gov/precip/PORT/SEMDP/CMORPH_CRT (accessed on 10 July 2025).

The Integrated Multi-satellite Retrievals for GPM (IMERG), developed as part of NASA’s Global Precipitation Measurement (GPM) mission, is one of the most advanced satellite precipitation products available. IMERG combines data from an international constellation of satellites, including GPM’s Dual-Frequency Precipitation Radar (DPR) and passive microwave sensors. It provides precipitation estimates at a spatial resolution of 0.1° (approximately 10 km) with three temporal products: half-hourly, daily, and monthly. IMERG employs a complex algorithm that merges microwave and infrared data with surface observations to enhance accuracy. Its high spatial and temporal resolution makes it suitable for diverse applications, although performance can vary across different climatic zones. The Daily IMERG V07 Final product utilized in this study has been available since June 2000. All IMERG datasets can be accessed and downloaded through NASA’s official website (http://pmm.nasa.gov/data-access/downloads/gpm, accessed on 12 July 2025).

The GSMaP dataset, developed by the Japan Aerospace Exploration Agency (JAXA), is another prominent satellite precipitation product. GSMaP provides rainfall estimates at a spatial resolution of 0.1° and a temporal resolution of one hour, which can be aggregated to daily values. GSMaP integrates data from multiple passive microwave sensors, using a microwave-infrared blend algorithm to fill gaps in coverage. The product also incorporates gauge-calibrated data in its real-time and reanalysis versions to enhance accuracy. While GSMaP performs well in tropical and subtropical regions, its performance in arid areas, where light and sporadic rainfall dominate, requires further evaluation. For this study, the GSMaP near-real-time (NRT) product, which has a latency of 4 h, was acquired from https://www.gportal.jaxa.jp.

Each of these datasets brings unique strengths and limitations to rainfall estimation. While gauge data offer precise point measurements, satellite products provide the spatially continuous coverage necessary for hydrological and climatological applications. This study leverages these complementary datasets to comprehensively evaluate the accuracy and reliability of the MSWEP product in the context of an arid environment. Table 1 provides the full details for the four satellite precipitation products described above.

2.2. Data Extraction

Daily rainfall values for the satellite precipitation products were extracted from NetCDF files using a nearest-neighbor interpolation approach. Nearest-neighbor extraction was selected to preserve the native rainfall signal from each satellite grid cell without imposing spatial smoothing. However, we acknowledge that in steep mountainous regions, this approach may introduce spatial sampling errors where sub-grid rainfall variability is high. The NetCDF files provide gridded rainfall data with geographic coordinates (latitude and longitude) and time dimensions. To align the satellite data with the locations of the 38 rain gauge stations, the latitude and longitude of the gridded satellite data were extracted from the NetCDF files, and the coordinates of the rain gauge stations were used as reference points. A KD-tree algorithm efficiently matched each rain gauge station to its nearest grid cell in the satellite dataset by computing the Euclidean distance between the station and all grid points, identifying the closest cell for rainfall value extraction. Daily rainfall values were extracted from the satellite data to maintain consistency with the temporal resolution of gauge data, and time variables in the NetCDF files were converted to a standard calendar format for matching. The extracted rainfall data for each satellite product was stored in tabular format, with rows representing rain gauge stations and columns containing daily rainfall values.
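The station-to-grid matching described above can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes a regular latitude/longitude grid already read from a NetCDF file into NumPy arrays, uses `scipy.spatial.cKDTree`, and the function names `nearest_cell_indices` and `extract_series` are hypothetical. Distances are Euclidean in degree space, as stated in the text.

```python
import numpy as np
from scipy.spatial import cKDTree


def nearest_cell_indices(grid_lat, grid_lon, st_lat, st_lon):
    """Return (lat_idx, lon_idx) arrays of the nearest grid cell per station."""
    # Build the full list of grid-cell centres, shape (nlat * nlon, 2).
    lon2d, lat2d = np.meshgrid(grid_lon, grid_lat)
    grid_pts = np.column_stack([lat2d.ravel(), lon2d.ravel()])
    tree = cKDTree(grid_pts)
    # Query the single nearest neighbour (Euclidean distance in degrees).
    _, flat_idx = tree.query(np.column_stack([st_lat, st_lon]))
    return np.unravel_index(flat_idx, lat2d.shape)


def extract_series(precip, lat_idx, lon_idx):
    """precip has shape (time, lat, lon); result has shape (station, time)."""
    return precip[:, lat_idx, lon_idx].T


# Toy grid and one illustrative station (coordinates are placeholders).
grid_lat = np.array([24.0, 24.1, 24.2])
grid_lon = np.array([55.0, 55.1])
lat_idx, lon_idx = nearest_cell_indices(grid_lat, grid_lon, [24.09], [55.02])
precip = np.arange(12, dtype=float).reshape(2, 3, 2)  # 2 days of dummy data
series = extract_series(precip, lat_idx, lon_idx)     # rows = stations
```

Over a domain as small as the UAE, degree-space distance is usually an acceptable proxy for geographic distance on a regular grid; a great-circle metric would be a natural refinement for larger domains.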

2.3. Data Pre-Processing

Pre-processing was conducted to ensure the quality, reliability, and consistency of both gauge and satellite datasets. The gauge data underwent a structured quality-control procedure that included duplicate removal and explicit handling of missing values. Missing daily rainfall values were linearly interpolated only when the missing gap was ≤2 consecutive days, ensuring that interpolation did not artificially distort rainfall sequences. Longer gaps were excluded from subsequent analyses. Outliers were identified using a station-specific climatological threshold, defined as values exceeding the 99.9th percentile or falling below 0 mm, which are physically implausible in this context. All flagged values were cross-checked against NCM metadata and neighboring stations before being removed. Temporal consistency checks ensured the absence of duplicated timestamps, gaps, or overlaps throughout the 2004–2020 period.
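The gap-filling and outlier-flagging rules above can be illustrated with a short sketch. It is written under stated assumptions: the helper names `fill_short_gaps` and `flag_outliers` are hypothetical, the percentile is taken over all valid daily values at a station (the text does not say whether dry days are included), and flagged values are only candidates, since the study cross-checks them before removal.

```python
import numpy as np


def fill_short_gaps(values, max_gap=2):
    """Linearly interpolate NaN runs of length <= max_gap; keep longer runs as NaN."""
    x = np.asarray(values, dtype=float).copy()
    isnan = np.isnan(x)
    n = len(x)
    i = 0
    while i < n:
        if not isnan[i]:
            i += 1
            continue
        j = i
        while j < n and isnan[j]:
            j += 1
        # The NaN run occupies [i, j). Fill it only if it is short and
        # bounded by valid values on both sides.
        if i > 0 and j < n and (j - i) <= max_gap:
            x[i:j] = np.interp(np.arange(i, j), [i - 1, j], [x[i - 1], x[j]])
        i = j
    return x


def flag_outliers(values, upper_pct=99.9):
    """Flag negative values and values above the station's upper percentile."""
    x = np.asarray(values, dtype=float)
    upper = np.nanpercentile(x, upper_pct)
    return (x < 0) | (x > upper)


nan = float("nan")
filled = fill_short_gaps([0.0, nan, 2.0, nan, nan, nan, 6.0])
# The 1-day gap is filled; the 3-day gap is left missing.
flags = flag_outliers([0.0, 1.0, -0.5, 2.0])
```

Note that a plain forward-limited interpolation (filling the first two days of any gap) would not match the stated rule, which is why the sketch checks the full run length before filling.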

For the satellite datasets, grid-cell coordinates were validated against station locations using KD-tree matching to confirm spatial extraction accuracy. Satellite missing values arising from sensor limitations or retrieval masks were retained as NaN, preventing unintended bias during metric calculations. All datasets were standardized to daily units (mm/day), and timestamps were synchronized to maintain temporal uniformity across gauge and satellite series. Finally, cross-validation routines were applied to summarize valid records and exclude stations or periods with insufficient data coverage.

2.4. Statistical Measures

Various statistical measures were employed to assess the quantitative agreement between satellite-based rainfall estimates and gauge observations (Table 2). The correlation coefficient (CC) was used to determine the strength and direction of the linear relationship between satellite and gauge data, with values closer to one indicating a stronger correlation. Root Mean Square Error (RMSE) measured the overall magnitude of error, with lower values representing better agreement. Mean Absolute Error (MAE) assessed the average magnitude of errors, providing a robust indicator of accuracy without being overly sensitive to outliers. The Kling-Gupta Efficiency (KGE), a composite metric incorporating correlation, bias, and variability, was used to provide a comprehensive measure of dataset performance, where values closer to one indicate higher agreement. Additionally, the Nash-Sutcliffe Efficiency (NSE) was utilized to evaluate the predictive skill of satellite-based rainfall estimates relative to gauge observations. NSE values range from negative infinity to one, with values closer to one signifying better predictive accuracy. A value of zero indicates that the model performs no better than the mean of observed data, while negative values suggest that the mean observation is a better predictor than the satellite estimates.
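As a rough illustration of these measures, the sketch below computes them for one station. The exact formulas used in the study are given in its Table 2, which is not reproduced here, so this sketch assumes the commonly used definitions, including the original three-component form of KGE (correlation, variability ratio, bias ratio). The function name `continuous_metrics` is hypothetical.

```python
import numpy as np


def continuous_metrics(sat, obs):
    """CC, RMSE, MAE, KGE and NSE for paired daily series (NaN pairs dropped)."""
    sat = np.asarray(sat, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ok = ~(np.isnan(sat) | np.isnan(obs))
    s, o = sat[ok], obs[ok]
    err = s - o
    r = np.corrcoef(s, o)[0, 1]   # linear correlation
    alpha = s.std() / o.std()     # variability ratio
    beta = s.mean() / o.mean()    # bias ratio (undefined if the gauge mean is 0)
    return {
        "CC": r,
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "MAE": np.mean(np.abs(err)),
        "KGE": 1 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2),
        "NSE": 1 - np.sum(err ** 2) / np.sum((o - o.mean()) ** 2),
    }


# Dummy gauge series and a satellite series with a constant +1 mm offset.
obs = np.array([0.0, 0.0, 5.0, 1.0, 0.0, 12.0])
perfect = continuous_metrics(obs, obs)
shifted = continuous_metrics(obs + 1.0, obs)
```

In the second case the correlation and variability terms are ideal, so the drop in KGE comes entirely from the bias ratio, which shows how KGE separates the sources of disagreement.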

2.5. Categorical Metrics

The ability of satellite products to detect rainfall events was evaluated using categorical metrics [47]. Probability of Detection (POD) assesses the fraction of observed rainfall events (daily rainfall > 1 mm) correctly detected by the satellite products. False Alarm Ratio (FAR) quantifies the proportion of rainfall events predicted by the satellite data but not observed in the gauge records, where lower values indicate fewer false alarms. The Critical Success Index (CSI) balances POD and FAR to provide an overall measure of event-detection accuracy, with higher values indicating better performance. Finally, the Frequency Bias Index (FBI) examines whether satellite datasets tend to overestimate or underestimate the occurrence of rainfall events. An FBI value of one indicates unbiased detection, while values above or below one reflect overestimation or underestimation, respectively. Details of these measures are presented in Table 2.
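The four detection scores follow from a 2×2 contingency table of hits, misses, and false alarms. The sketch below assumes the standard contingency-table definitions and the 1 mm/day event threshold stated above; the function name `categorical_metrics` is hypothetical, and the input values are dummy numbers chosen only to exercise each cell of the table.

```python
import numpy as np


def categorical_metrics(sat, obs, threshold=1.0):
    """POD, FAR, CSI and FBI for rain events defined as daily rainfall > threshold."""
    sat = np.asarray(sat, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ok = ~(np.isnan(sat) | np.isnan(obs))
    s_wet = sat[ok] > threshold
    o_wet = obs[ok] > threshold

    hits = int(np.sum(s_wet & o_wet))           # rain in both series
    misses = int(np.sum(~s_wet & o_wet))        # gauge rain not seen by satellite
    false_alarms = int(np.sum(s_wet & ~o_wet))  # satellite rain not in gauge

    def ratio(num, den):
        return num / den if den > 0 else float("nan")

    return {
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "CSI": ratio(hits, hits + misses + false_alarms),
        "FBI": ratio(hits + false_alarms, hits + misses),
    }


obs = [0.0, 5.0, 3.0, 0.0, 2.0, 0.0]
sat = [2.0, 4.0, 0.0, 0.0, 3.0, 0.0]
scores = categorical_metrics(sat, obs)  # 2 hits, 1 miss, 1 false alarm
```

Here FBI equals one even though POD is below one, because the single miss and the single false alarm cancel in the event count; this is why FBI is read alongside POD and FAR rather than on its own.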

2.6. Extreme Rainfall Indices

The study also assessed the ability of satellite products to replicate extreme rainfall characteristics using various climate indices [48] (Table 2). Consecutive Dry Days (CDD) represented the maximum number of consecutive days with rainfall less than 1 mm, providing a measure of aridity. Conversely, Consecutive Wet Days (CWD) reflected the persistence of wet periods, denoting the maximum number of consecutive days with rainfall exceeding 1 mm. The maximum 1-day (Rx1day) and 5-day (Rx5day) precipitation totals were used to evaluate the ability of satellite datasets to capture extreme rainfall events that are critical for flood risk assessment. Additionally, the annual counts of days with rainfall exceeding thresholds of 10 mm, 20 mm, and 30 mm (R10mm, R20mm, and R30mm) were examined to determine how well heavy rainfall events were captured. Total annual precipitation on wet days (PRCPTOT) and the annual total precipitation from rainfall events exceeding the 95th percentile (R95P) were calculated to understand the contribution of significant and extreme rainfall events to total precipitation.
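A compact sketch of these indices for a single year of daily data is given below. It follows the thresholds as worded in the text (strictly less than or greater than 1 mm, and strictly exceeding 10, 20, and 30 mm); the ETCCDI convention uses "greater than or equal to" for the wet-day and R10mm-type thresholds, so results can differ on days that sit exactly on a threshold. The helper names are hypothetical, the series is assumed to be gap-free, and the 95th-percentile threshold is taken from the wet days of the supplied series unless a reference value is passed in.

```python
import numpy as np


def longest_run(mask):
    """Length of the longest run of True values."""
    best = cur = 0
    for flag in mask:
        cur = cur + 1 if flag else 0
        best = max(best, cur)
    return best


def extreme_indices(daily, wet=1.0, p95=None):
    """Extreme rainfall indices for one year of gap-free daily totals (mm)."""
    x = np.asarray(daily, dtype=float)
    wet_days = x > wet
    if p95 is None:
        # Assumption: percentile computed from this series' own wet days.
        p95 = np.percentile(x[wet_days], 95) if wet_days.any() else np.nan
    # Rolling 5-day sums via convolution with a window of ones.
    rx5 = np.convolve(x, np.ones(5), mode="valid").max() if len(x) >= 5 else x.sum()
    return {
        "CDD": longest_run(x < wet),
        "CWD": longest_run(wet_days),
        "Rx1day": float(x.max()),
        "Rx5day": float(rx5),
        "R10mm": int(np.sum(x > 10)),
        "R20mm": int(np.sum(x > 20)),
        "R30mm": int(np.sum(x > 30)),
        "PRCPTOT": float(x[wet_days].sum()),
        "R95P": float(x[x > p95].sum()),
    }


# Ten dummy days; p95 is supplied explicitly for the illustration.
idx = extreme_indices([0, 0, 12, 3, 0, 0, 0, 25, 2, 0], p95=20.0)
```

With only a handful of wet days per year, as is typical in hyper-arid climates, a percentile computed from a single year is unstable, so a multi-year reference period is the usual choice for the R95P threshold.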

It is important to clarify the inferential scope of this study. Comparisons between gauge observations and gridded precipitation products are performed at the point scale and reflect local agreement at individual station locations. In contrast, the assessment of seasonal behavior, regional performance differences, and relative ranking among products is intended to be representative at the regional scale across the UAE. While the findings provide robust insights for dataset selection, climate analysis, and research-oriented hydrological modeling in hyper-arid environments, they should not be directly extrapolated to operational forecasting or hazard management applications without additional local calibration, uncertainty analysis, and event-specific validation.

3. Results

3.1. Statistical Measures Evaluation

Table S1 provides a comprehensive station-level evaluation of the four quantitative precipitation estimate (QPE) datasets (MSWEP, CMORPH, IMERG, and GSMaP) using three key statistical indicators: RMSE, MAE, and KGE. This condensed framework captures both the magnitude of errors and the overall agreement between satellite-based estimates and gauge observations, enabling a more focused comparison of product efficiency across the UAE’s diverse topography.

The analysis reveals that GSMaP consistently delivers the most balanced and reliable performance across the majority of stations, as indicated by lower RMSE and MAE values and higher KGE scores. This reflects GSMaP’s robust calibration procedures and dynamic gauge-adjustment algorithms, which effectively reduce bias and improve temporal consistency. Its superior skill is particularly evident in areas with complex topography, such as Masafi, Jabal Hafeet, and Jabal Jais, where microwave-based retrievals can better resolve orographic rainfall mechanisms. The relatively low RMSE and high KGE at these locations suggest that GSMaP captures both the intensity and variability of rainfall events more accurately than the other products. CMORPH also performs favorably, particularly in mountainous and high-rainfall zones. Its morphing technique, which relies on time interpolation between microwave overpasses, appears to enhance temporal coherence. However, CMORPH occasionally underestimates peak intensities and displays higher MAE at coastal and inland stations such as Al Gheweifat and Al Ain, indicating a potential difficulty in representing light or spatially fragmented rainfall.

IMERG, while a globally advanced product incorporating multiple sensors and gauge adjustments, exhibits mixed performance across the study domain. In many inland and coastal locations, IMERG tends to overestimate light to moderate rainfall, resulting in higher RMSE and slightly lower KGE values. This overestimation likely arises from its reliance on infrared retrievals during microwave data gaps, which can misinterpret warm-cloud signatures common in arid and maritime climates as precipitation. Despite these biases, IMERG maintains reasonable performance consistency, suggesting its potential utility after localized bias correction. In contrast, MSWEP shows the weakest performance across most stations, marked by higher RMSE and MAE and lower KGE values. Its coarse spatial resolution and reliance on globally merged reanalysis and gauge data limit its effectiveness in capturing highly localized convective rainfall, which dominates the UAE’s hydroclimate. MSWEP’s static bias correction method may further reduce responsiveness to short-term rainfall variability, leading to underestimation during peak events and misrepresentation of rainfall totals.

Spatially, the statistical measures illustrate a clear gradient in QPE performance from mountainous to inland areas. Stations with complex terrain and higher rainfall variability show better overall agreement between satellite and gauge data, whereas flat, arid inland sites display larger discrepancies. This pattern emphasizes the combined influence of topography, rainfall intensity, and algorithmic design on QPE accuracy. Critically, while GSMaP and CMORPH outperform other datasets, none of the QPEs exhibits universally high accuracy across all stations. This finding underscores the inherent challenge of applying global precipitation products to a hyper-arid region with complex terrain and sporadic rainfall. It also highlights the need for tailored post-processing methods, such as regional bias correction, merging with high-density gauge networks, or hybrid machine learning frameworks, to optimize rainfall estimation for hydrological and climate applications in the UAE.

The regional comparison (Figure 2) provides an integrated view of the mean normalized performance of all four QPE datasets—MSWEP, CMORPH, IMERG, and GSMaP—across three physiographically distinct zones: Mountainous, Coastal, and Inland regions. The evaluation is based on three key metrics: RMSE, MAE, and KGE, where values were normalized so that higher scores indicate better performance. This figure serves to reveal the regional dependency of satellite rainfall retrieval skill in the UAE and highlights the influence of topography and proximity to the coast on QPE accuracy.
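The metrics and the "higher is better" normalization described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text does not state which normalization was used, so min-max scaling across products is an assumption, the KGE follows the common 2009 formulation, and the names `product_a`, `product_b`, and all sample values are hypothetical.

```python
import math
from statistics import mean, pstdev


def rmse(obs, est):
    """Root mean square error between paired gauge and product values."""
    return math.sqrt(mean([(e - o) ** 2 for o, e in zip(obs, est)]))


def mae(obs, est):
    """Mean absolute error between paired gauge and product values."""
    return mean([abs(e - o) for o, e in zip(obs, est)])


def kge(obs, est):
    """Kling-Gupta Efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2).

    Undefined when the gauge series is constant or has zero mean.
    """
    mo, me = mean(obs), mean(est)
    so, se = pstdev(obs), pstdev(est)
    cov = mean([(o - mo) * (e - me) for o, e in zip(obs, est)])
    r = cov / (so * se)   # linear correlation
    alpha = se / so       # variability ratio
    beta = me / mo        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def normalize(scores, higher_is_better):
    """Assumed min-max scaling to [0, 1]; error metrics are flipped so 1 is best."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    out = {}
    for name, value in scores.items():
        scaled = (value - lo) / span if span else 1.0
        out[name] = scaled if higher_is_better else 1.0 - scaled
    return out


# Hypothetical daily rainfall (mm) at one station.
gauge = [0.0, 2.0, 10.0, 0.0, 5.0]
products = {
    "product_a": [0.0, 3.0, 8.0, 1.0, 5.0],
    "product_b": [1.0, 6.0, 4.0, 3.0, 9.0],
}

rmse_norm = normalize({n: rmse(gauge, e) for n, e in products.items()}, higher_is_better=False)
kge_norm = normalize({n: kge(gauge, e) for n, e in products.items()}, higher_is_better=True)
print(rmse_norm, kge_norm)
```

With only two products, min-max scaling necessarily returns 0 and 1; with four products the intermediate scores become informative.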

The Mountainous region exhibits the highest overall performance among all QPEs, particularly for GSMaP and CMORPH, which consistently achieve superior normalized scores across all three metrics. This strong performance can be attributed to the frequent occurrence of orographic precipitation and well-defined convective structures that are better captured by microwave sensors integrated into these algorithms. The lower RMSE and higher KGE values indicate that both GSMaP and CMORPH effectively represent rainfall magnitude and temporal variability in topographically complex environments. In contrast, IMERG, despite its advanced calibration and dense sensor network, tends to overestimate rainfall intensity in these areas, likely due to the dominance of convective echoes misinterpreted by its infrared components.

In the Coastal region, the performance of all QPEs diminishes slightly, but GSMaP remains the most reliable dataset. The reduced accuracy in this zone is likely linked to the influence of marine atmospheric conditions—such as sea-breeze convection and shallow maritime clouds—that often challenge satellite-based retrievals. The lower skill of MSWEP and IMERG in coastal settings reflects their difficulties in resolving light or short-lived rainfall events, which are typical near the coast.

The Inland region exhibits the weakest performance across all QPEs, with the highest normalized errors (RMSE and MAE) and relatively low KGE values. This degradation is expected in hyper-arid areas where rainfall events are infrequent, localized, and often below the sensitivity thresholds of satellite retrieval algorithms. The coarse spatiotemporal resolution of global QPE products limits their ability to detect sporadic convective storms, leading to larger discrepancies compared to gauge observations.

The regional analysis clearly demonstrates that GSMaP consistently outperforms other QPEs in both mountainous and coastal zones, while CMORPH provides competitive accuracy in mountainous terrain. These results emphasize the strong topographic control on QPE reliability and highlight the necessity of regional bias correction or hybrid merging strategies to improve precipitation representation in inland desert areas. The insights from this regional assessment provide an essential foundation for developing localized satellite–gauge merging frameworks tailored to the hydroclimatic diversity of the UAE.

3.2. Rainfall Intensity Frequency Comparison

Figure 3 compares gauge observations with four QPE datasets (MSWEP, IMERG, CMORPH, and GSMaP) across five rainfall intensity categories (Very Light, Light, Moderate, Heavy, and Extreme) during 2004–2020. The gauge data indicate that Very Light and Light rainfall events account for approximately 78% of total occurrences, Moderate rainfall for about 15%, and Heavy to Extreme rainfall for less than 7%.

Among the QPEs, MSWEP most closely matches the gauge distribution, capturing 76% of events in the light rainfall range and 17% in the moderate range, showing its strength in reproducing overall event frequency. IMERG performs comparably but tends to overestimate light rainfall (≈82%) and underrepresent heavier intensities (>25 mm). CMORPH exhibits a bias toward underestimating light rain (≈70%), whereas GSMaP shows a flatter distribution, slightly overpredicting heavy rainfall events (>30 mm).

The comparison highlights MSWEP’s relatively balanced detection capability across all intensity categories, while the other satellite products show systematic biases—either underrepresenting or overrepresenting specific rainfall intensities. These differences have implications for hydrological modeling in arid environments, where accurate representation of low- to moderate-intensity rainfall is essential for water resource and flood risk assessments.

3.3. Categorical and Extreme Indices

The comparison of categorical indices highlights the strengths and limitations of each dataset in terms of detection capability, false alarm rates, and overall agreement with ground-based measurements (Figure 4). Each subplot (a–d) corresponds to one of the categorical indices, enabling a direct comparison across the products. In subplot (a), the POD values indicate the products’ ability to correctly detect precipitation events. MSWEP and IMERG exhibit consistently high POD values, with IMERG showing slightly higher values overall. GSMaP also performs reasonably well, while CMORPH shows a noticeable spread and lower POD values, suggesting it struggles to detect precipitation consistently across the region. Subplot (b) shows the FAR values, which measure the proportion of false precipitation detections; therefore, lower FAR values indicate better performance. A high FAR in arid regions such as the UAE carries particular implications due to the region’s low rainfall frequency and short-lived convective systems. In such environments, even minor overestimations of rainfall frequency can significantly distort perceptions of wet conditions, leading to false flood warnings or misinformed water management decisions. The elevated FAR observed for IMERG and GSMaP in Figure 4b suggests a tendency of these products to misclassify non-rainy pixels as rainfall, likely due to challenges in distinguishing light cloud signatures from actual precipitation under warm, dry atmospheric conditions. Subplot (c) displays the CSI values, which account for hits, misses, and false alarms to assess overall success. MSWEP demonstrates the highest median CSI, followed by GSMaP and IMERG, indicating that these products achieve a better balance between correct detections and false alarms. CMORPH exhibits lower CSI values and greater variability, reflecting its limited ability to accurately capture precipitation events.
Figure 4c shows that MSWEP achieves a more balanced detection of rainfall events, minimizing missed detections while avoiding excessive false alarms. This balance makes MSWEP comparatively more reliable for hydrological and operational applications where both overprediction and underprediction can have profound implications. Finally, subplot (d) presents the FBI values, which measure the tendency to overestimate or underestimate precipitation events. All four products exhibit FBI values greater than 1, indicating a general tendency to overestimate precipitation. MSWEP shows the least overestimation, with lower variability than the other products. GSMaP and IMERG exhibit higher median FBI values, while CMORPH shows the greatest spread, suggesting significant overestimation in some areas.

MSWEP shows consistent performance across all categorical indices, with balanced detection capabilities (POD of about 0.7, FAR of about 0.65), better overall success (CSI of about 0.35), and the smallest frequency bias of the four products (FBI of about 2). IMERG also performs well, particularly in POD, but shows a slightly higher FBI, indicating a modest tendency to over-detect rainfall events compared to gauges; however, this difference is within an acceptable range and is not statistically significant. GSMaP demonstrates moderate performance but exhibits higher false alarm ratios and bias. CMORPH, on the other hand, shows the weakest performance, with substantial variability and lower scores across most indices, highlighting its limitations in accurately detecting and quantifying precipitation in the study region.
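For readers less familiar with these scores, the sketch below shows how POD, FAR, CSI, and FBI follow from a rain/no-rain contingency table. It uses the standard textbook definitions; the 1 mm rain/no-rain threshold and the event counts are assumptions chosen for illustration, not values taken from the study.

```python
def contingency(obs, est, threshold=1.0):
    """Count hits, misses, and false alarms for paired daily values (mm).

    The 1 mm rain/no-rain threshold is an assumed default.
    """
    hits = misses = false_alarms = 0
    for o, e in zip(obs, est):
        obs_wet, est_wet = o >= threshold, e >= threshold
        if obs_wet and est_wet:
            hits += 1
        elif obs_wet:
            misses += 1
        elif est_wet:
            false_alarms += 1
    return hits, misses, false_alarms


def categorical_indices(hits, misses, false_alarms):
    """Standard categorical scores; assumes at least one observed and one estimated event."""
    return {
        "POD": hits / (hits + misses),                  # fraction of observed events detected
        "FAR": false_alarms / (hits + false_alarms),    # fraction of detections that were false
        "CSI": hits / (hits + misses + false_alarms),   # hits relative to all events of interest
        "FBI": (hits + false_alarms) / (hits + misses), # estimated events / observed events
    }


# Hypothetical counts: 7 hits, 3 misses, 13 false alarms.
scores = categorical_indices(7, 3, 13)
print(scores)
```

These hypothetical counts give POD = 0.70, FAR = 0.65, and FBI = 2.0, with CSI close to 0.30, which shows how an FBI near 2 arises whenever false alarms outnumber misses by a wide margin. Station medians need not satisfy such identities exactly, so this is an illustration rather than a reconstruction of Figure 4.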

3.4. Extreme Rainfall Indices Evaluation

Consecutive Dry Days (CDD) (Figure 5a) quantifies the length of dry spells. Gauge observations exhibit moderate CDD values, representing typical intermittent rainfall behavior over the UAE. Both MSWEP and IMERG tend to overestimate the duration of dry periods, reflecting a conservative bias toward non-rainy days, while CMORPH underestimates CDD with a narrower range. GSMaP demonstrates the highest variability, indicating inconsistent dry spell detection.

Consecutive Wet Days (CWD) (Figure 5b) represents the persistence of wet periods. MSWEP shows close agreement with gauge data, with only a minor positive bias, suggesting it effectively captures short-duration wet sequences common in the region. CMORPH and GSMaP, however, show a broader spread and higher CWD values, suggesting a tendency to overestimate the continuity of wet events, whereas IMERG maintains a moderate, balanced distribution.
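A minimal sketch of how CDD and CWD can be computed from a daily series is given below. It assumes the widely used ETCCDI convention that a wet day has at least 1 mm of rainfall; the threshold actually applied in this study is not stated here, and the sample series is hypothetical.

```python
def longest_run(flags):
    """Length of the longest consecutive run of True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def cdd(daily_mm, wet_threshold=1.0):
    """Consecutive Dry Days: longest run of days below the wet-day threshold."""
    return longest_run(p < wet_threshold for p in daily_mm)


def cwd(daily_mm, wet_threshold=1.0):
    """Consecutive Wet Days: longest run of days at or above the wet-day threshold."""
    return longest_run(p >= wet_threshold for p in daily_mm)


# Hypothetical ten-day series (mm/day).
series = [0.0, 0.0, 3.2, 1.5, 0.0, 0.0, 0.0, 0.4, 12.0, 0.0]
print(cdd(series), cwd(series))
```

Because both indices depend on a single threshold, a product that adds spurious light rain just above 1 mm shortens CDD and lengthens CWD at the same time.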

For Rx1 day and Rx5 day indices (Figure 5c,d), representing the maximum 1-day and 5-day precipitation totals, respectively, MSWEP captures these extreme rainfall magnitudes relatively well but with a slightly wider interquartile range. CMORPH and IMERG underestimate extreme precipitation intensities, while GSMaP exhibits high variability, reflecting inconsistent detection of short-lived convective storms.

The R10 mm and R20 mm indices (Figure 5e,f), denoting the number of days exceeding 10 mm and 20 mm thresholds, further demonstrate MSWEP’s reliability in reproducing heavy rainfall frequency. Although it slightly overestimates moderate rainfall events, its distribution remains consistent with gauge observations. In contrast, CMORPH shows weaker correspondence, while IMERG and GSMaP display broader dispersions, reflecting difficulties in accurately representing moderate to heavy rainfall occurrences.

Finally, R30 mm, PRCPTOT, and R95P (Figure 5g–i) capture very heavy rainfall days, total precipitation accumulation, and the 95th percentile of daily rainfall, respectively. MSWEP aligns most closely with gauge observations for PRCPTOT and R95P, confirming its ability to represent total and extreme rainfall amounts. A slight positive bias in R30 mm suggests a tendency to overrepresent the most intense events, which may stem from its multi-source merging strategy and global bias correction approach.
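The magnitude and threshold indices discussed above can be sketched in a few lines. The definitions below follow common ETCCDI-style conventions (wet day of at least 1 mm, percentile taken over wet days), which is an assumption about this study's exact settings; the `percentile` helper uses simple linear interpolation and the sample series is hypothetical.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(s) - 1)
    return s[lower] + (s[upper] - s[lower]) * (pos - lower)


def rx_nday(daily_mm, n):
    """Maximum n-day running total (Rx1day for n=1, Rx5day for n=5)."""
    return max(sum(daily_mm[i:i + n]) for i in range(len(daily_mm) - n + 1))


def days_at_or_above(daily_mm, threshold_mm):
    """Number of days reaching a threshold (R10 mm, R20 mm, R30 mm)."""
    return sum(1 for p in daily_mm if p >= threshold_mm)


def prcptot(daily_mm, wet_threshold=1.0):
    """Total precipitation accumulated on wet days."""
    return sum(p for p in daily_mm if p >= wet_threshold)


def r95p(daily_mm, wet_threshold=1.0):
    """Return (95th percentile of wet days, total rainfall on days above it).

    Requires at least one wet day in the series.
    """
    wet = [p for p in daily_mm if p >= wet_threshold]
    threshold = percentile(wet, 95)
    return threshold, sum(p for p in wet if p > threshold)


# Hypothetical ten-day series (mm/day).
series = [0.0, 0.0, 3.2, 1.5, 0.0, 0.0, 0.0, 0.4, 12.0, 25.0]
print(rx_nday(series, 1), rx_nday(series, 5), days_at_or_above(series, 10),
      prcptot(series), r95p(series))
```

In a record with very few wet days, as is typical of arid stations, the 95th percentile is set by one or two events, which is one reason these indices show wide spreads across products.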

MSWEP exhibits balanced performance across all indices, effectively capturing the spatial and temporal variability of both moderate and extreme rainfall over the UAE. While IMERG performs comparably well in some indices, CMORPH and GSMaP show less consistency, particularly under conditions of localized or convective rainfall typical of arid climates. These results emphasize the potential of MSWEP as a reliable precipitation dataset for hydrological and climate applications, while also reflecting the inherent challenges satellite algorithms face in detecting extreme events in dry regions with high spatial rainfall heterogeneity.

3.5. Qualitative Measures

Figure 6 presents a comparative analysis of precipitation characteristics across gauge observations and four satellite-based precipitation products. Figure 6A shows the percentage of time without precipitation for each dataset, providing insight into the dry periods across the study region. The gauge observations indicate high dry spell frequencies, with most stations showing non-precipitation percentages exceeding 95%. The satellite products generally capture this dry period pattern, but notable differences emerge. CMORPH appears to underestimate dry conditions relative to gauge data, particularly in the southern regions, suggesting potential false precipitation detection. In contrast, MSWEP, IMERG, and GSMaP align more closely with the gauge estimates, though localized discrepancies are evident in coastal and northern areas.

Figure 6B displays the 95th percentile precipitation (95PP), representing extreme rainfall thresholds across the datasets. The gauge data exhibit substantial spatial variability, with higher values concentrated in the northeastern coastal and mountainous regions. IMERG and GSMaP exhibit the largest deviations from gauge observations, generally overestimating extreme precipitation events, particularly in the northeastern regions. CMORPH, on the other hand, tends to underestimate 95PP across most stations, indicating a potential bias toward lower rainfall intensities. MSWEP demonstrates relatively balanced performance, capturing the spatial variability of the 95th percentile precipitation (95PP), with values ranging approximately from 9 to 20 mm day⁻¹ in inland regions to over 25 mm day⁻¹ along the eastern mountains. In contrast, CMORPH and GSMaP show weaker spatial correspondence, with underestimation over the coastal and inland plains where 95PP values drop below 20 mm day⁻¹, indicating that MSWEP more effectively represents the intensity distribution across varying topographic zones. These findings suggest that while satellite products provide valuable insights into extreme precipitation characteristics, their reliability varies across regions and with precipitation dynamics.

Similarly, the annual total rainfall and its deviations from the long-term normal for the gauge observations and satellite precipitation products over the study period from 2004 to 2020 are presented in Figure 7. Panel (a) displays the total annual average rainfall recorded by each dataset, with the horizontal black line representing the long-term mean annual rainfall of 100.6 mm. It highlights interannual variability, with some years exhibiting significant above-average rainfall (e.g., 2006, 2009, and 2019), while others fall well below the historical mean (e.g., 2011, 2014, and 2017). Differences among the satellite products are evident, particularly in years with extreme rainfall events, where GSMaP frequently exhibits higher totals than the other products. Panel (b) presents the rainfall departures from the normal annual average, emphasizing wet and dry years. Positive values indicate years with above-average precipitation, while negative values reflect drier-than-normal conditions. The gauge observations and satellite datasets exhibit similar trends in most years, indicating the ability of these products to capture overall rainfall variability. However, discrepancies are notable, particularly in extreme years where some products overestimate or underestimate departures from normal conditions. GSMaP shows more pronounced peaks in wet years, while MSWEP and CMORPH tend to provide more moderate variations.
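The departure calculation behind panel (b) is simple arithmetic, sketched below. The 100.6 mm normal is the value quoted in the text; the annual totals are hypothetical placeholders rather than values read from Figure 7, and the percentage form is an added convenience.

```python
LONG_TERM_MEAN_MM = 100.6  # long-term mean annual rainfall quoted in the text


def departures(annual_totals_mm, normal_mm=LONG_TERM_MEAN_MM):
    """Absolute (mm) and relative (%) departure of each annual total from the normal."""
    return {
        year: {
            "departure_mm": total - normal_mm,
            "departure_pct": 100.0 * (total - normal_mm) / normal_mm,
        }
        for year, total in annual_totals_mm.items()
    }


# Hypothetical annual totals (mm); not values from the study.
example_totals = {"wet_year": 150.9, "dry_year": 50.3}
example = departures(example_totals)
print(example)
```

Positive departures mark wetter-than-normal years and negative departures mark drier-than-normal years, matching the sign convention described for panel (b).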

3.6. Quantitative Measures

The heatmap (Figure 8) presents an aggregated comparison of the average number of rainfall days classified into three categories: light rainfall (<5 mm), moderate rainfall (5–20 mm), and heavy rainfall (>20 mm) for the gauge data and four satellite products. The color intensity reflects the frequency of rainfall occurrences, with warmer colors indicating a higher number of days and cooler colors representing fewer days. A key observation is the substantial overestimation of light rainfall days by satellite products relative to gauge observations, particularly for MSWEP, which records more than six times as many light rainfall days as the gauge. This overestimation suggests that MSWEP tends to detect low-intensity precipitation more frequently than ground measurements. Similarly, IMERG and GSMaP also show higher counts of light rainfall days than the gauge, while CMORPH approximates more closely but remains overestimated. For moderate rainfall (5–20 mm), IMERG reports nearly three times as many days as the gauge, followed by GSMaP, which also overestimates the frequency. CMORPH, on the other hand, appears to underestimate moderate rainfall events. For heavy rainfall events (>20 mm), all satellite products show lower frequencies than gauge observations, with MSWEP and CMORPH detecting the fewest heavy rainfall days. This underestimation suggests that satellite products struggle to accurately capture extreme precipitation events, which are crucial for hydrological and flood risk assessments. Overall, the heatmap highlights significant discrepancies in the detection of rainfall frequency across different intensity levels, emphasizing the importance of further calibration and validation for improving satellite-based precipitation estimates in the region.
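The day counts summarized in the heatmap can be reproduced with a simple classifier such as the one sketched here. The class limits (<5 mm, 5–20 mm, >20 mm) come from the text; the 0.1 mm lower cutoff for counting a rain day is an assumption, and both sample series are hypothetical.

```python
from collections import Counter

CLASSES = ("light", "moderate", "heavy")


def classify_day(p_mm, rain_day_min=0.1):
    """Return the rainfall class of one day, or None for a dry day.

    rain_day_min is an assumed cutoff; the study's own value is not given here.
    """
    if p_mm < rain_day_min:
        return None
    if p_mm < 5:
        return "light"
    if p_mm <= 20:
        return "moderate"
    return "heavy"


def count_rain_days(daily_mm, rain_day_min=0.1):
    """Number of days falling in each rainfall class."""
    counts = Counter(classify_day(p, rain_day_min) for p in daily_mm)
    return {name: counts.get(name, 0) for name in CLASSES}


# Hypothetical ten-day series (mm/day).
gauge = [0.0, 0.0, 0.5, 0.0, 7.0, 0.0, 0.0, 25.0, 0.0, 0.0]
product = [0.3, 0.2, 1.0, 0.4, 6.0, 0.8, 0.0, 15.0, 0.2, 0.0]

gauge_counts = count_rain_days(gauge)
product_counts = count_rain_days(product)
light_ratio = product_counts["light"] / gauge_counts["light"]
print(gauge_counts, product_counts, light_ratio)
```

In this toy case the product reports six light days against one at the gauge and no heavy day at all, the same qualitative pattern the heatmap describes. The light-day count is highly sensitive to the lower cutoff, so that choice should be reported alongside any such comparison.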

Similarly, the box plots in Figure 9 illustrate the distribution of rainfall event counts across the precipitation datasets GAUGE, MSWEP, CMORPH, IMERG, and GSMaP. Each box represents the interquartile range (IQR), with the median denoted by the horizontal line within the box, while whiskers extend to the minimum and maximum values within 1.5 times the IQR. Outliers are marked as individual points beyond the whiskers. The GAUGE dataset, which represents ground observations, shows a moderate spread in rainfall events, with the median slightly above 40 events. MSWEP and CMORPH exhibit distributions similar to the gauge data, but with slightly lower medians, indicating that these products capture rainfall event frequencies relatively well but tend to underestimate the extremes. IMERG, on the other hand, displays a significantly higher median and broader spread, suggesting it detects more rainfall events, likely due to overestimation of light precipitation. GSMaP also shows a higher median than GAUGE, but its spread is smaller than IMERG’s.
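The box-plot elements described above (quartiles, 1.5 × IQR whiskers, outliers) can be computed as in the following sketch. It uses the inclusive quartile method from Python's `statistics` module; plotting libraries may use slightly different quartile rules, and the event counts are hypothetical.

```python
from statistics import quantiles


def box_stats(values):
    """Quartiles, whisker ends, and outliers under the 1.5 x IQR rule."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),    # smallest value still inside the fences
        "whisker_high": max(inside),   # largest value still inside the fences
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Hypothetical rainfall-event counts at nine stations.
event_counts = [30, 35, 38, 40, 42, 45, 48, 50, 95]
stats = box_stats(event_counts)
print(stats)
```

Here the station with 95 events lies beyond the upper fence (63) and would be drawn as an individual outlier point, while the whiskers stop at 30 and 50.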

MSWEP and CMORPH appear to align more closely with the gauge-based event distribution, while IMERG tends to overestimate the number of rainfall events, potentially due to a lower detection threshold. GSMaP also shows a tendency toward higher event counts but with slightly lower variability than IMERG. These variations highlight differences in how each satellite-based dataset detects and records precipitation events in comparison to ground observations.

3.7. Seasonal Analysis

Seasonal analysis is essential for understanding the performance consistency of satellite precipitation products (SPPs) under varying hydroclimatic regimes. In arid and semi-arid regions such as the UAE, rainfall is highly episodic and often concentrated in short winter periods driven by synoptic-scale systems, while the rest of the year remains mostly dry. Evaluating satellite-based quantitative precipitation estimates (QPEs) at an annual scale may therefore obscure seasonal disparities in detection accuracy and bias patterns. A seasonal breakdown into winter (DJF), spring (MAM), summer (JJA), and autumn (SON) helps reveal how retrieval algorithms respond to distinct precipitation mechanisms (e.g., convective vs. stratiform rain) and atmospheric conditions such as humidity, cloud-top temperature, and wind shear, which directly influence microwave and infrared signal retrievals.

Figure 10 (Seasonal Skill Metrics) illustrates the performance of MSWEP, CMORPH, IMERG, and GSMaP against the spatially averaged gauge observations across the UAE for each season. The results indicate substantial seasonal variability in accuracy, with notably better skill in the cool and transition seasons (DJF and MAM) than in the hot, dry summer (JJA).

Winter (DJF): CMORPH and GSMaP exhibit higher correspondence with gauge data, reflected by R² values above 0.5 and positive Kling–Gupta Efficiency (KGE) scores, suggesting better representation of stratiform frontal rainfall. IMERG, although showing a relatively high R², demonstrates pronounced positive bias, consistent with its known tendency to overestimate wetness in low-latitude arid regions where light rainfall and shallow cloud systems challenge retrieval accuracy.
Spring (MAM): CMORPH again performs best, achieving the highest R² (~0.74) and KGE (~0.61), with minimal bias (−3.3%). This season marks the transition from frontal to convective rainfall, and CMORPH’s morphing technique appears more adept at capturing spatially shifting convective cells than other datasets.
Summer (JJA): All SPPs exhibit degraded performance, with low R² and negative KGE values, as rainfall events during this period are rare, localized, and often below sensor detection thresholds. The inflated biases are largely due to the low denominator effect in percentage metrics, which amplifies relative errors when small gauge totals are used.
Autumn (SON): GSMaP shows the highest agreement with gauges (R² ≈ 0.79; KGE ≈ 0.84), demonstrating its robust detection of early convective storms and its adaptive gauge calibration scheme that improves retrieval precision during transitional months.

Overall, CMORPH dominates in the cooler half of the year, while GSMaP outperforms during the warmer, convective-dominated months. MSWEP, despite being gauge-corrected, performs less consistently, potentially due to the sparse ground network used in its bias correction at coarse spatial scales.
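The seasonal grouping and the low-denominator effect noted for summer can be illustrated with a short sketch. The month-to-season mapping follows the DJF/MAM/JJA/SON convention used above; the record layout `(date, gauge_mm, product_mm)`, the dates, and the rainfall amounts are all hypothetical.

```python
from datetime import date

SEASON_OF_MONTH = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}


def seasonal_totals(records):
    """Sum gauge and product rainfall per season from (date, gauge_mm, product_mm) records."""
    totals = {}
    for day, gauge_mm, product_mm in records:
        season = SEASON_OF_MONTH[day.month]
        g, p = totals.get(season, (0.0, 0.0))
        totals[season] = (g + gauge_mm, p + product_mm)
    return totals


def percent_bias(gauge_total, product_total):
    """Relative bias in percent; unstable when the gauge total is close to zero."""
    return 100.0 * (product_total - gauge_total) / gauge_total


# Hypothetical records: two winter days and one summer day.
records = [
    (date(2019, 1, 5), 20.0, 22.0),
    (date(2019, 1, 6), 20.0, 22.0),
    (date(2019, 7, 20), 0.5, 2.5),
]

totals = seasonal_totals(records)
bias = {season: percent_bias(g, p) for season, (g, p) in totals.items()}
print(totals, bias)
```

In this example the winter error of 4 mm yields a bias of +10%, whereas the smaller summer error of 2 mm yields +400% because the gauge total is only 0.5 mm. Reporting absolute errors next to percentage biases helps avoid over-reading summer results.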

Similarly, Figure 11 compares GSMaP and IMERG scatter relationships with gauge observations for DJF and MAM. Both datasets show noticeable scatter around the 1:1 reference line, with IMERG generally producing higher rainfall estimates than GSMaP.

In DJF, GSMaP closely follows the 1:1 line at moderate rainfall magnitudes (0–15 mm day⁻¹), indicating its better calibration under stable stratiform rainfall conditions. IMERG, however, overestimates high-intensity events, with points deviating above the reference line. This suggests that the IMERG retrieval algorithm tends to misclassify thin, cold cloud-tops as rain-bearing systems, resulting in spurious precipitation.

In MAM, both products show wider dispersion, but CMORPH (as inferred from the skill metrics) and GSMaP perform more realistically than IMERG, which continues to overestimate totals beyond 20 mm day⁻¹. The overprediction by IMERG may stem from its precipitation-phase detection scheme, which relies heavily on infrared data in regions with limited passive microwave overpasses. GSMaP, by contrast, applies an adaptive Kalman filter with ground-based adjustments, enhancing its ability to match the magnitude and frequency of observed rainfall during transitional seasons.

Figure 12 highlights how QPE algorithms respond to distinct meteorological systems in the UAE, represented by two contrasting rainfall events: (a) 11 January 2020 (DJF frontal system) and (b) 14 August 2013 (JJA convective storm). The three performance indicators—RMSE, MAE, and KGE—clearly distinguish between these rainfall regimes, illustrating both the strengths and inherent weaknesses of satellite precipitation products in arid, topographically complex regions.

During Event (a), a widespread frontal system brought uniform, stratiform rainfall across much of the country. Under such conditions, the four QPEs showed strong convergence toward gauge observations, as reflected in low RMSE and MAE and relatively high KGE. The performance consistency, particularly of GSMaP and CMORPH, can be attributed to their microwave-driven retrieval schemes and effective gauge calibration, which are well suited to detecting large-scale, thermodynamically stable precipitation systems. IMERG also performed satisfactorily but tended to slightly overestimate low-intensity rainfall, a common outcome of its infrared-based gap-filling algorithms. In contrast, MSWEP’s coarser spatial resolution resulted in moderate underestimation, though its overall pattern remained consistent with the gauge network.

By comparison, Event (b), associated with an intense, short-lived convective burst over the Hajar Mountains, revealed the limitations of all QPEs. RMSE and MAE increased sharply while KGE values dropped, showing widespread misrepresentation of the event’s magnitude and distribution. The convective nature of JJA storms—characterized by small spatial footprints, high rainfall intensity, and rapid evolution—poses a fundamental challenge for satellite algorithms with limited temporal revisit frequency. Microwave overpasses often miss the storm core, while IR-based proxies misclassify non-precipitating clouds as rainfall. Even GSMaP and IMERG, which incorporate near-real-time calibration, struggled to track the localized peaks observed in gauges. CMORPH’s smoothing technique further weakened its responsiveness to such transient events.

3.8. Spatial Variations in QPE Performance and Physiographic Controls

A more analytical interpretation of the spatial patterns in QPE performance reveals that physiography and rainfall-generation mechanisms play a central role in shaping retrieval accuracy across the UAE. The mountainous stations, such as Jabal Jais, Jabal Hafeet, Jabal Mebreh, Masafi, and Al-Tawiyen, consistently show larger discrepancies between gauge observations and satellite-based estimates. These regions experience highly localized convective bursts and orographic uplift, which produce sharp spatial rainfall gradients over short distances. Such dynamics challenge satellite retrieval algorithms, particularly those relying on infrared and passive microwave signals, because warm-cloud precipitation and shallow convective towers often lack strong thermal contrast. As a result, CMORPH and IMERG tend to underestimate peak intensities in these high-elevation zones, while GSMaP typically displays greater variability due to its motion-vector tracking and gauge-adjustment processes that become less stable in steep terrain. In inland lowland regions, rainfall intensity is typically weaker and dominated by warm-cloud processes, which leads IMERG to systematically overestimate precipitation due to its sensitivity to thin cloud layers. MSWEP performs better overall but still inherits biases from its coarse global gauge correction framework, which cannot fully capture the hyper-local spatial variability of rainfall in arid environments. Coastal stations, such as Fujairah Airport, Fujairah Port, Hatta, and Dhudna, exhibit comparatively higher agreement among QPEs because rainfall events along the coast are often produced by broader synoptic-scale systems that enhance microwave detectability and spatial consistency across retrieval algorithms. These patterns confirm that QPE accuracy in the UAE is not solely a function of sensor performance; it is strongly modulated by atmospheric dynamics, microphysical characteristics of rainfall, and terrain complexity. 
These physiographic controls help explain why certain datasets perform well in one region but poorly in another.

4. Discussion

Accurate estimation of precipitation is crucial for water resource management, flood prediction, and climate change adaptation, particularly in arid environments like the United Arab Emirates (UAE). In this study, we evaluated the performance of the MSWEP precipitation dataset against gauge observations and three commonly used satellite-based precipitation products (CMORPH, IMERG, and GSMaP) using a range of statistical, categorical, and extreme precipitation indices. Our findings provide valuable insights into the ability of these products to capture the spatial and temporal variability of precipitation in a region characterized by erratic and low-intensity rainfall events. The statistical measures, including correlation coefficient (CC), root mean square error (RMSE), mean absolute error (MAE), Kling-Gupta Efficiency (KGE), and Nash-Sutcliffe Efficiency (NSE), provided a comprehensive assessment of the products’ accuracy. The results revealed that MSWEP exhibited a relatively higher correlation with gauge observations compared to CMORPH and GSMaP, but slightly lower than IMERG. However, in terms of RMSE and MAE, MSWEP performed better than IMERG, indicating lower absolute deviations from gauge-based precipitation estimates. KGE values showed that MSWEP had more balanced performance in terms of variability and bias, while NSE values confirmed that MSWEP is a reliable product for representing precipitation variability in the region. These findings align with previous studies that highlighted the ability of high-resolution satellite products such as IMERG to capture precipitation patterns in arid regions but also emphasized their limitations in accurately estimating extreme precipitation events [8,16].

The categorical indices further confirmed the strengths and weaknesses of each product. MSWEP demonstrated a higher probability of detection (POD) compared to CMORPH and GSMaP but was slightly outperformed by IMERG, which has been previously recognized for its improved detection capabilities [29]. However, MSWEP exhibited a higher false alarm ratio (FAR), suggesting that it sometimes overestimates rainfall event occurrence, which could be attributed to its data-merging approach. The critical success index (CSI) showed that MSWEP provided a better balance between detection and false alarms, while the frequency bias index (FBI) indicated a tendency of IMERG and GSMaP to overestimate precipitation events, a pattern observed in previous studies evaluating satellite products over arid regions [26]. Extreme precipitation indices, such as consecutive dry days (CDD), consecutive wet days (CWD), Rx1 day, Rx5 day, R10 mm, R20 mm, R30 mm, PRCPTOT, and R95P, further highlighted the discrepancies between the satellite products and gauge observations. MSWEP performed reasonably well in capturing dry and wet periods, with results similar to those found for CMORPH and GSMaP, though slight overestimation of wet periods was observed. The 95th percentile precipitation (95PP) analysis indicated that all satellite products struggled to capture extreme rainfall accurately, with GSMaP and IMERG showing the highest overestimations. These findings are consistent with Wehbe et al. [15], who reported that satellite precipitation products tend to overestimate rainfall in regions with complex terrain and low rainfall frequency.

While gauge observations serve as the benchmark for validating satellite precipitation products, they are not entirely error-free, particularly in arid regions like the UAE, where sparse networks and localized rainfall can limit spatial representativeness. Measurement uncertainties arising from wind undercatch, evaporation losses, or sensor calibration may partly contribute to discrepancies between gauge and satellite estimates. Despite applying rigorous quality control and data verification, these inherent limitations should be acknowledged as a source of uncertainty when interpreting QPE performance.

This seasonal evaluation yields several essential insights into how satellite precipitation products behave under varying hydroclimatic conditions. The performance of these datasets is influenced by the type of rainfall and the prevailing atmospheric processes that dominate each season. The analysis clearly shows that satellite retrieval accuracy is not constant throughout the year; instead, it fluctuates in response to changes in cloud structure, rainfall intensity, and background temperature. CMORPH and GSMaP tend to perform better because their algorithms are better suited to stratiform and mixed precipitation systems that dominate the cooler months, whereas IMERG often struggles with light rainfall or warm-cloud events typical of the UAE’s arid climate.

From a hydrological modeling perspective, these seasonal variations are crucial. Using uncorrected IMERG data, for instance, could lead to overestimated runoff or unrealistic hydrological responses during the winter and spring seasons due to its tendency to overpredict rainfall intensity. In contrast, GSMaP and CMORPH provide more balanced rainfall estimates that better match the observed gauge data, making them more suitable for rainfall–runoff modeling and flash flood prediction in data-scarce environments. The relatively moderate performance of MSWEP highlights another limitation: the reliance on coarse global gauge datasets for bias correction, which may not capture the localized rainfall patterns of the UAE’s highly variable terrain. Integrating regional rain gauge networks into such global products would likely enhance their representativeness and reliability. This aligns with ref. [8], which emphasized the importance of using multiple hydro-climatological measures to assess satellite precipitation accuracy. The seasonal bias trends further indicated that MSWEP generally performed well across all seasons, with biases more pronounced in winter and spring, similar to the patterns observed in previous assessments of satellite-based products in the UAE [41].

In broader evaluations of arid and semi-arid regions, the performance of the MSWEP dataset has shown comparable behavior to our findings in the UAE. For example, ref. [49] reported that MSWEP v2.2 achieved one of the highest Kling-Gupta Efficiency (KGE) values across West Africa, outperforming most satellite-only estimates, though its performance varied seasonally depending on gauge density. Similarly, ref. [50] found that in the semi-arid tropics of Tamil Nadu, India, MSWEP provided consistent climatological accuracy but tended to underestimate high-intensity convective rainfall, mirroring our observations of underrepresentation of extremes. Furthermore, ref. [51] demonstrated that over the complex terrain of the Hengduan Mountains in China, MSWEP effectively captured the spatial distribution of mean precipitation but struggled to represent extreme events accurately, consistent with the overestimation tendencies of other satellite datasets such as IMERG and GSMaP observed in our study. These regional comparisons reinforce that while MSWEP’s multi-source merging framework ensures relatively stable performance for light to moderate rainfall across diverse dry and topographically variable environments, it remains sensitive to extreme rainfall detection. It would benefit from region-specific bias correction and gauge integration to improve hydrological applicability in arid zones. In summary, while MSWEP exhibited commendable performance in capturing precipitation characteristics over the UAE, certain biases remain, particularly regarding the frequency and magnitude of extreme rainfall events. Compared to CMORPH and GSMaP, MSWEP demonstrated a better correlation with gauge observations, although IMERG outperformed it in some statistical and categorical measures. 
The findings of this study reaffirm the need for regional calibration of satellite precipitation products and highlight the importance of integrating multiple datasets to enhance precipitation estimates in arid regions. Future research should focus on bias correction techniques and multi-source precipitation merging to improve the reliability of precipitation estimates for hydrological applications in the UAE.

The performance of MSWEP over the UAE also aligns with findings from recent evaluations conducted in other arid and semi-arid regions, highlighting both its strengths and limitations in data-sparse environments. Ref. [52] demonstrated that MSWEP V2 performs competitively in hydrological simulations across China’s diverse climate zones, ranking second only to GPCC for extreme precipitation indices. Importantly, their study showed that all precipitation datasets, including MSWEP, tend to underestimate annual maximum 1-day and 5-day precipitation while overestimating R95p in dry northwestern China, a pattern consistent with the underestimation and spread we observed for Rx1 day, Rx5 day, and R95P in the UAE. Similarly, ref. [53] reported that MSWEP v2.8 achieved strong performance in Pakistan’s low-elevation arid regions, but its accuracy deteriorated sharply in high-elevation mountainous areas where all global precipitation products exhibited large overestimations. This behavior mirrors our findings that MSWEP and other QPEs struggle over the UAE’s mountainous terrain due to highly localized convective rainfall and sparse gauge coverage. In the humid-to-semi-arid Huaihe River Basin, ref. [54] showed that MSWEP V2.1/V2.2 provides excellent temporal accuracy but weaker representation of spatial rainfall patterns and extreme daily events, particularly ≥100 mm/day, which again reflects the tendency of MSWEP in our study to capture seasonal variability well but underestimate spatial heterogeneity and peak extremes. Collectively, these studies confirm that while MSWEP is a reliable multi-source product for representing precipitation climatology and seasonal dynamics in arid settings, its performance for short-duration extremes and in heterogeneous terrain remains constrained, reinforcing the need for regional calibration, supplemental gauge networks, and multi-product integration for risk-sensitive hydrological applications in the UAE.

5. Conclusions

The accurate estimation of precipitation is vital for effective water resource management, climate studies, and hydrological modeling, particularly in arid environments like the UAE. This study comprehensively evaluated the performance of MSWEP against gauge observations and three widely used satellite-based precipitation products (CMORPH, IMERG, and GSMaP) using a suite of statistical, categorical, and extreme precipitation indices. The findings indicate that MSWEP performance over the UAE should be interpreted in a conditional manner. While it shows higher RMSE and MAE and lower KGE than GSMaP and CMORPH at most stations, indicating weaker skill in reproducing event-scale rainfall magnitude, it exhibits comparatively balanced rainfall occurrence characteristics, including high POD (~0.7), moderate FAR (~0.65), the highest median CSI (~0.35), and lower variability in FBI. MSWEP also reproduces rainfall intensity–frequency distributions, PRCPTOT, and R95P reasonably well, supporting its suitability for seasonal and climatological analyses. However, systematic underestimation of Rx1 day, Rx5 day, and heavy rainfall days (>20 mm) limits its applicability for short-duration extreme events and hazard-focused applications. Clear distinction between skill in seasonal accumulation, rainfall detection, and extreme magnitude is therefore essential when interpreting MSWEP reliability in hyper-arid environments.

Despite its overall effectiveness, MSWEP, like other satellite-based products, faces limitations in capturing extreme precipitation events, as seen in the analysis of the 95th percentile precipitation and extreme precipitation indices. The underestimation of heavy rainfall and overestimation of light rainfall remain challenges that must be addressed. Additionally, while MSWEP provides a reasonable representation of dry and wet spells, its performance varies across different locations, emphasizing the importance of localized validation efforts. The need for enhanced bias correction techniques and integration with ground-based observations remains a crucial step in improving satellite precipitation estimates in arid regions.

Looking forward, future research should focus on refining bias correction algorithms for MSWEP and other satellite-based products to improve their reliability in arid environments. The integration of high-resolution regional models, reanalysis datasets, and machine learning techniques could further enhance precipitation estimation accuracy. Additionally, multi-source data fusion approaches, combining gauge, radar, and satellite data, may provide more comprehensive insights into rainfall dynamics. Given the increasing importance of remote sensing for climate monitoring and water resource planning, continued validation and improvement of satellite precipitation products will be essential for addressing the challenges of precipitation estimation in data-scarce regions like the UAE.

This study contributes to the growing body of research assessing the reliability of satellite precipitation products in arid climates, underscoring the strengths and limitations of MSWEP in comparison to established datasets. The findings provide valuable insights for hydrologists, meteorologists, and policymakers working toward improved precipitation monitoring and water management strategies in the UAE and similar regions. In operational contexts, the evaluated products, particularly MSWEP, can play a vital role in supporting water resource planning, flood forecasting, and drought monitoring in the UAE. The high temporal resolution and spatial completeness of MSWEP make it particularly useful for driving hydrological and climate models in data-scarce regions with limited gauge coverage. When bias-corrected and regionally calibrated, MSWEP can serve as a reliable input for early warning systems and long-term water management strategies, enhancing preparedness and resilience to hydroclimatic extremes in arid environments. Future advancements in satellite retrieval algorithms and the incorporation of advanced statistical corrections using machine learning will be key to further enhancing the applicability of satellite precipitation data for hydrological and climatological studies. Moreover, integrating MSWEP data with local radar data may improve the accuracy with which extreme events in the country are captured.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs18010095/s1, Table S1: Station-wise comparison of statistical measures used to assess the performance of MSWEP in the study.

Author Contributions

Conceptualization, A.S.A. and F.B.; methodology, H.A.A.; software, F.B. and M.E.; validation, A.S.A., F.B. and M.E.; formal analysis, A.S.A. and H.A.A.; investigation, F.B.; resources, M.E.; data curation, F.B. and M.E.; writing—original draft preparation, A.S.A. and H.A.A.; writing—review and editing, F.B.; visualization, A.S.A. and H.A.A.; supervision, M.S.; project administration, M.S.; funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

No funding was provided for this study.

Data Availability Statement

The gauge data used in this study is confidential and cannot be shared publicly. Satellite product data can be freely downloaded from the links provided in the Materials and Methods section.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Anagnostou, E.N.; Maggioni, V.; Nikolopoulos, E.I.; Meskele, T.; Hossain, F.; Papadopoulos, A. Benchmarking High-Resolution Global Satellite Rainfall Products to Radar and Rain-Gauge Rainfall Estimates. IEEE Trans. Geosci. Remote Sens. 2010, 48, 1667–1683.
2. Zhang, T.; Liang, Z.; Wang, H.; Wang, J.; Hu, Y.; Li, B. Merging multisatellite precipitation products using stacking method and the censored-shifted gamma ensemble model output statistics in China’s Beimiaoji basin. J. Hydrol. 2023, 618, 129263.
3. Mondal, A.; Lakshmi, V.; Hashemi, H. Intercomparison of trend analysis of Multisatellite Monthly Precipitation Products and Gauge Measurements for River Basins of India. J. Hydrol. 2018, 565, 779–790.
4. Zhang, L.; Li, X.; Zheng, D.; Zhang, K.; Ma, Q.; Zhao, Y.; Ge, Y. Merging multiple satellite-based precipitation products and gauge observations using a novel double machine learning approach. J. Hydrol. 2021, 594, 125969.
5. Leganés, L.J.; Navarro, A.; Lee, G.; Martín, R.; Kidd, C.; Tapiador, F.J. TRMM-era neural networks for GPM-era satellite quantitative precipitation estimation (QPE). Atmos. Res. 2025, 315, 107879.
6. Biswas, S.; Singh, C.; Bharti, V. An assessment of GPM IMERG Version 7 rainfall estimates over the North West Himalayan region. Atmos. Res. 2025, 315, 107910.
7. Chen, H.; Wen, D. Dependency of errors for four global reanalysis and satellite precipitation estimates on four crucial factors. Atmos. Res. 2023, 296, 107076.
8. Baig, F.; Abrar, M.; Chen, H.; Sherif, M. Rainfall Consistency, Variability, and Concentration over the UAE: Satellite Precipitation Products vs. Rain Gauge Observations. Remote Sens. 2022, 14, 5827.
9. Bisht, D.S.; Kumar, D.P.; Amarjyothi, K.; Saha, U. Bias correction of satellite precipitation estimates using Mumbai-MESONET observations: A Random Forest approach. Atmos. Res. 2025, 315, 107858.
10. Chen, C.; He, Q.; Li, Y. Downscaling and merging multiple satellite precipitation products and gauge observations using random forest with the incorporation of spatial autocorrelation. J. Hydrol. 2024, 632, 130919.
11. Elkollaly, M.; Sefelnasr, A.; Sherif, M. A comprehensive investigation of event-based rainfall analysis in arid and semi-arid climates: An integration of innovative storm volume-duration-frequency (SVDF) schemes and Event-Based dimensionless hyetographs (EDH). J. Hydrol. 2025, 655, 132959.
12. Gado, T.A.; Elkollaly, M.; Guo, Y.; El-Hagrsy, R.M.; Mohameden, M.B.; Shalaby, B.A.; Elboshy, B.; Omara, H.; ElSawwaf, M.A. Event-based rainfall analysis in Sinai, Egypt. Hydrol. Sci. J. 2024, 69, 622–638.
13. Gentilucci, M.; Bufalini, M.; D’Aprile, F.; Materazzi, M.; Pambianchi, G. Comparison of Data from Rain Gauges and the IMERG Product to Analyse Precipitation in Mountain Areas of Central Italy. ISPRS Int. J. Geo-Inf. 2021, 10, 795.
14. Bhardwaj, A.; Ziegler, A.D.; Wasson, R.J.; Chow, W.T.L. Accuracy of rainfall estimates at high altitude in the Garhwal Himalaya (India): A comparison of secondary precipitation products and station rainfall measurements. Atmos. Res. 2017, 188, 30–38.
15. Wehbe, Y.; Ghebreyesus, D.; Temimi, M.; Milewski, A.; Al Mandous, A. Assessment of the consistency among global precipitation products over the United Arab Emirates. J. Hydrol. Reg. Stud. 2017, 12, 122–135.
16. Hussein, K.A.; Alsumaiti, T.S.; Ghebreyesus, D.T.; Sharif, H.O.; Abdalati, W. High-Resolution Spatiotemporal Trend Analysis of Precipitation Using Satellite-Based Products over the United Arab Emirates. Water 2021, 13, 2376.
17. Sharifi, E.; Steinacker, R.; Saghafian, B. Assessment of GPM-IMERG and Other Precipitation Products against Gauge Data under Different Topographic and Climatic Conditions in Iran: Preliminary Results. Remote Sens. 2016, 8, 135.
18. de Almeida, C.T.; Delgado, R.C.; de Oliveira Junior, J.F.; Gois, G.; Cavalcanti, A.S. Avaliação das Estimativas de Precipitação do Produto 3B43-TRMM do Estado do Amazonas. Floresta E. Ambiente 2015, 22, 279–286.
19. Huffman, G.J.; Bolvin, D.T.; Nelkin, E.J.; Wolff, D.B.; Adler, R.F.; Gu, G.; Hong, Y.; Bowman, K.P.; Stocker, E.F. The TRMM Multisatellite Precipitation Analysis (TMPA): Quasi-Global, Multiyear, Combined-Sensor Precipitation Estimates at Fine Scales. J. Hydrometeorol. 2007, 8, 38–55.
20. Li, Y.; Yan, H.; Chen, L.; Huang, M.; Shou, W.; Zhu, L.; Zhao, L.; Xing, Y. Performance and uncertainties of five popular satellite-based precipitation products in drought monitoring for different climate regions. J. Hydrol. 2024, 628, 130562.
21. Amjad, M.; Yilmaz, M.T.; Yucel, I.; Yilmaz, K.K. Performance evaluation of satellite- and model-based precipitation products over varying climate and complex topography. J. Hydrol. 2020, 584, 124707.
22. Hinge, G.; Hamouda, M.A.; Long, D.; Mohamed, M.M. Hydrologic utility of satellite precipitation products in flood prediction: A meta-data analysis and lessons learnt. J. Hydrol. 2022, 612, 128103.
23. Lemma, E.; Upadhyaya, S.; Ramsankaran, R. Meteorological drought monitoring across the main river basins of Ethiopia using satellite rainfall product. Environ. Syst. Res. 2022, 11, 7.
24. Huffman, G.; Bolvin, D.; Braithwaite, D.; Hsu, K.; Joyce, R.; Xie, P. NASA Global Precipitation Measurement (GPM) Integrated Multi-satellitE Retrievals for GPM (IMERG). In Global Precipitation Measurement; NASA: Washington, DC, USA, 2018.
25. Derin, Y.; Kirstetter, P.-E.; Brauer, N.; Gourley, J.J.; Wang, J. Evaluation of IMERG Satellite Precipitation over the Land–Coast–Ocean Continuum. Part II: Quantification. J. Hydrometeorol. 2022, 23, 1297–1314.
26. Alsumaiti, T.S.; Hussein, K.; Ghebreyesus, D.T.; Sharif, H.O. Performance of the CMORPH and GPM IMERG Products over the United Arab Emirates. Remote Sens. 2020, 12, 1426.
27. Chaudhary, S.; Dhanya, C.T. Decision tree-based reduction of bias in monthly IMERG satellite precipitation dataset over India. H₂Open J. 2020, 3, 236–255.
28. Li, R.; Guilloteau, C.; Kirstetter, P.-E.; Foufoula-Georgiou, E. How well does the IMERG satellite precipitation product capture the timing of precipitation events? J. Hydrol. 2023, 620, 129563.
29. Mahmoud, M.T.; Hamouda, M.A.; Mohamed, M.M. Spatiotemporal evaluation of the GPM satellite precipitation products over the United Arab Emirates. Atmos. Res. 2019, 219, 200–212.
30. Joyce, R.J.; Janowiak, J.E.; Arkin, P.A.; Xie, P. CMORPH: A Method that Produces Global Precipitation Estimates from Passive Microwave and Infrared Data at High Spatial and Temporal Resolution. J. Hydrometeorol. 2004, 5, 487–503.
31. Kim, J.; Han, H. Evaluation of the CMORPH high-resolution precipitation product for hydrological applications over South Korea. Atmos. Res. 2021, 258, 105650.
32. Ning, S.; Ma, C.; Chen, R.; Bai, S. E_GSMaP precipitation dataset reforecasted by RF-WMRA: Description and validation. Sci. Total Environ. 2025, 958, 177963.
33. Beck, H.E.; Wood, E.F.; Pan, M.; Fisher, C.K.; Miralles, D.G.; van Dijk, A.I.J.M.; McVicar, T.R.; Adler, R.F. MSWEP V2 Global 3-Hourly 0.1° Precipitation: Methodology and Quantitative Assessment. Bull. Am. Meteorol. Soc. 2019, 100, 473–500.
34. Beck, H.E.; Vergopolan, N.; Pan, M.; Levizzani, V.; van Dijk, A.I.J.M.; Weedon, G.P.; Brocca, L.; Pappenberger, F.; Huffman, G.J.; Wood, E.F. Global-scale evaluation of 22 precipitation datasets using gauge observations and hydrological modeling. Hydrol. Earth Syst. Sci. 2017, 21, 6201–6217.
35. Beck, H.E.; Pan, M.; Roy, T.; Weedon, G.P.; Pappenberger, F.; van Dijk, A.I.J.M.; Huffman, G.J.; Adler, R.F.; Wood, E.F. Daily evaluation of 26 precipitation datasets using Stage-IV gauge-radar data for the CONUS. Hydrol. Earth Syst. Sci. 2019, 23, 207–224.
36. Baig, F.; Ali, L.; Faiz, M.A.; Chen, H.; Sherif, M. From bias to accuracy: Transforming satellite precipitation data in arid regions with machine learning and topographical insights. J. Hydrol. 2025, 653, 132801.
37. Chen, H.; Chandrasekar, V.; Tan, H.; Cifelli, R. Rainfall Estimation From Ground Radar and TRMM Precipitation Radar Using Hybrid Deep Neural Networks. Geophys. Res. Lett. 2019, 46, 10669–10678.
38. Wodebo, D.Y.; Melesse, A.M.; Woldesenbet, T.A.; Mekonnen, K.; Amdihun, A.; Korecha, D.; Tedla, H.Z.; Corzo, G.; Teshome, A. Comprehensive performance evaluation of satellite-based and reanalysis rainfall estimate products in Ethiopia: For drought, flood, and water resources applications. J. Hydrol. Reg. Stud. 2025, 57, 102150.
39. Tang, G.; Clark, M.P.; Papalexiou, S.M.; Ma, Z.; Hong, Y. Have satellite precipitation products improved over last two decades? A comprehensive comparison of GPM IMERG with nine satellite and reanalysis datasets. Remote Sens. Environ. 2020, 240, 111697.
40. Sun, R.; Yuan, H.; Yang, Y. Using multiple satellite-gauge merged precipitation products ensemble for hydrologic uncertainty analysis over the Huaihe River basin. J. Hydrol. 2018, 566, 406–420.
41. Mahmoud, M.T.; Al-Zahrani, M.A.; Sharif, H.O. Assessment of global precipitation measurement satellite products over Saudi Arabia. J. Hydrol. 2018, 559, 1–12.
42. Baig, F.; Abrar, M.; Chen, H.; Sherif, M. Evaluation of precipitation estimates from remote sensing and artificial neural network based products (PERSIANN) family in an arid region. Remote Sens. 2023, 15, 1078.
43. Hamouda, M.A.; Hinge, G.; Yemane, H.S.; Al Mosteka, H.; Makki, M.; Mohamed, M.M. Reliability of GPM IMERG satellite precipitation data for modelling flash flood events in selected watersheds in the UAE. Remote Sens. 2023, 15, 3991.
44. Mohammed, S.A.; Hamouda, M.A.; Mahmoud, M.T.; Mohamed, M.M. Performance of GPM-IMERG precipitation products under diverse topographical features and multiple-intensity rainfall in an arid region. Hydrol. Earth Syst. Sci. Discuss. 2020, 2020, 1–27.
45. Sefelnasr, A.; Ebraheem, A.A.; Faiz, M.A.; Shi, X.; Alghafli, K.; Baig, F.; Al-Rashed, M.; Alshamsi, D.; Ahamed, M.B.; Sherif, M. Enhancement of Groundwater Recharge from Wadi Al Bih Dam, UAE. Water 2022, 14, 3448.
46. Sherif, M.; Almulla, M.; Shetty, A.; Chowdhury, R.K. Analysis of rainfall, PMP and drought in the United Arab Emirates. Int. J. Climatol. 2014, 34, 1318–1328.
47. Ghelli, A. Verification of Categorical Predictands; American Meteorological Society (AMS): Boston, MA, USA, 2009.
48. Karl, T.R.; Nicholls, N.; Ghazi, A. CLIVAR/GCOS/WMO Workshop on Indices and Indicators for Climate Extremes Workshop Summary. In Weather and Climate Extremes; Karl, T.R., Nicholls, N., Ghazi, A., Eds.; Springer: Dordrecht, The Netherlands, 1999; pp. 3–7.
49. Satgé, F.; Defrance, D.; Sultan, B.; Bonnet, M.-P.; Seyler, F.; Rouché, N.; Pierron, F. Evaluation of 23 gridded precipitation datasets across West Africa. J. Hydrol. 2020, 581, 124412.
50. Saravanan, A.; Karthe, D.; Ramalingam, S.; Schütze, N. Evaluation of remote-sensing and reanalysis-based precipitation products for agro-hydrological studies in semi-arid tropics of Tamil Nadu. Hydrol. Earth Syst. Sci. 2024, 29, 4847–4870.
51. Dong, W.; Wang, G.; Guo, L.; Sun, J.; Sun, X. Evaluation of three gridded precipitation products in characterizing extreme precipitation over the Hengduan Mountains region in China. Remote Sens. 2022, 14, 4408.
52. Wan, Y.; Li, D.; Sun, J.; Wang, M.; Liu, H. Evaluation of six latest precipitation datasets for extreme precipitation estimates and hydrological application across various climate regions in China. Atmos. Res. 2025, 315, 107932.
53. Abbas, H.; Song, W.; Wang, Y.; Xiang, K.; Chen, L.; Feng, T.; Linghu, S.; Alam, M. Validation of CRU TS v4.08, ERA5-Land, IMERG v07B, and MSWEP v2.8 precipitation estimates against observed values over Pakistan. Remote Sens. 2024, 16, 4803.
54. Li, L.; Wang, Y.; Wang, L.; Hu, Q.; Zhu, Z.; Li, L.; Li, C. Spatio-temporal accuracy evaluation of MSWEP daily precipitation over the Huaihe River Basin, China: A comparison study with representative satellite- and reanalysis-based products. J. Geogr. Sci. 2022, 32, 2271–2290.

Figure 1. (a) Location of the study area, (b) gauge locations, and (c) spatial distribution of total annual average rainfall (mm).

Figure 2. Regional comparison of all QPEs. Metrics were normalized using min–max scaling across stations to allow comparative visualization.

Figure 3. Comparison of rainfall intensity–frequency distributions between gauge observations and four satellite-based QPE datasets (MSWEP, IMERG, CMORPH, and GSMaP) for the period 2004–2020.

Figure 4. Boxplots of the categorical indices. (a) Probability of Detection (POD); (b) False Alarm Ratio (FAR); (c) Critical Success Index (CSI); and (d) Frequency Bias Index (FBI).

Figure 5. Boxplots of extreme precipitation indices for gauge observations, MSWEP, and three satellite products (CMORPH, IMERG, and GSMaP). The indices include (a) Consecutive Dry Days (CDD), (b) Consecutive Wet Days (CWD), (c) Rx1 day (maximum daily rainfall), (d) Rx5 day (maximum 5-day rainfall), (e) R10 mm, (f) R20 mm, (g) R30 mm (days with rainfall exceeding 10, 20, and 30 mm, respectively), (h) PRCPTOT (total precipitation), and (i) R95P (rainfall above the 95th percentile).
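The indices named in the Figure 5 caption can be illustrated with a short Python sketch. It is not the authors' code: it assumes NumPy, one year of daily totals in mm for a single station, and a 1 mm wet-day threshold taken from the CWD definition. The function names and the `p95_mm` argument are hypothetical; in particular, the reference period used to derive the 95th percentile is not stated here, so the threshold is passed in rather than computed.

```python
import numpy as np


def longest_run(mask):
    """Length of the longest run of consecutive True values."""
    best = run = 0
    for flag in mask:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def extreme_indices(daily_mm, p95_mm, wet_threshold=1.0):
    """Extreme precipitation indices for one year of daily totals (mm).

    Assumes a non-empty series. p95_mm is the 95th-percentile threshold,
    supplied by the caller (assumption: its reference period is unknown).
    """
    p = np.asarray(daily_mm, dtype=float)
    wet = p >= wet_threshold  # wet day: rainfall >= 1 mm by default

    # Rolling 5-day sums; fall back to the total for very short series.
    if p.size >= 5:
        rx5 = float(np.convolve(p, np.ones(5), mode="valid").max())
    else:
        rx5 = float(p.sum())

    return {
        "Rx1day": float(p.max()),          # maximum 1-day rainfall
        "Rx5day": rx5,                     # maximum 5-day rainfall
        "R10mm": int(np.sum(p >= 10.0)),   # days with rainfall >= 10 mm
        "R20mm": int(np.sum(p >= 20.0)),   # days with rainfall >= 20 mm
        "R30mm": int(np.sum(p >= 30.0)),   # days with rainfall >= 30 mm
        "CWD": longest_run(wet),           # longest wet spell (days)
        "CDD": longest_run(~wet),          # longest dry spell (days)
        "PRCPTOT": float(p[wet].sum()),    # total rainfall on wet days
        "R95P": float(p[p > p95_mm].sum()),  # rainfall above the threshold
    }


if __name__ == "__main__":
    sample = [0, 0, 12, 25, 3, 0, 0, 0, 35, 0.5]  # made-up values
    print(extreme_indices(sample, p95_mm=30.0))
```

Applying the same function to the gauge series and to each product at a station gives the paired values that boxplots of this kind summarize.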

Figure 6. Spatial comparison of precipitation characteristics between gauge observations and four satellite-based products (MSWEP, CMORPH, IMERG, and GSMaP). (A) Percentage of time with no precipitation, illustrating the frequency of dry periods across the study region. (B) 95th percentile precipitation (95PP), representing the threshold for extreme rainfall events at each station. Color scales indicate variations in precipitation characteristics, with red representing higher values and blue representing lower values.

Figure 7. (a) Total annual average rainfall recorded by gauge observations and four satellite-based precipitation products (MSWEP, CMORPH, IMERG, and GSMaP) from 2004 to 2020. The black horizontal line represents the long-term mean annual rainfall of 100.6 mm. (b) Rainfall departures from normal conditions, showing deviations from the long-term mean for each dataset. Positive values indicate wetter-than-normal years, while negative values represent drier-than-normal years.

Figure 8. Aggregated heatmap of the average number of rainfall days categorized into light (<5 mm), moderate (5–20 mm), and heavy (>20 mm) precipitation events for the gauge data and four satellite products (MSWEP, CMORPH, IMERG, and GSMaP). The color intensity represents the frequency of rainfall occurrences, with warmer colors indicating a higher number of detected rainfall days.

Figure 9. Boxplots illustrating the distribution of rainfall event counts across GAUGE, MSWEP, CMORPH, IMERG, and GSMaP datasets. The interquartile range (IQR) is represented by the shaded boxes, with the median shown as a horizontal line within each box. Whiskers indicate the minimum and maximum values within 1.5 times the IQR, while outliers are marked as individual points.

Figure 10. Seasonal skill metrics (R², RMSE, MAE, and KGE) of four satellite precipitation products (MSWEP, CMORPH, IMERG, and GSMaP) compared with spatially averaged gauge observations over the UAE for the period 2004–2020.

Figure 11. Seasonal scatter plots comparing GSMaP and IMERG against gauge observations for winter (DJF) and spring (MAM) seasons. The 1:1 reference line highlights deviations in rainfall estimation.

Figure 12. Comparison of precipitation products for distinct meteorological events. (a) RMSE; (b) MAE; and (c) KGE.

Table 1. Details of the four satellite precipitation products used in the study (2004–2020).

Product Name	Source	Spatial Resolution	Temporal Resolution	Global Coverage	Data Span
MSWEP V2.8	GloH2O	0.1° (~10 km)	3-hourly	Global	1979-present
CMORPH-CRT	Climate Prediction Center (CPC)	0.25° (~25 km)	Half-hourly	Global (60°N to 60°S)	1998-present
IMERG-Final	NASA Global Precipitation Measurement (GPM) Mission	0.1° (~10 km)	Half-hourly	Global (90°N to 90°S)	2000-present
GSMaP-NRT	Japan Aerospace Exploration Agency (JAXA)	0.1° (~10 km)	Hourly	Global (60°N to 60°S)	1998-present

Table 2. Details of all measures used to assess the performance of MSWEP in the study.

Measure	Formula	Range	Perfect Value
Root Mean Square Error (mm)	$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_{i} - x_{i})^{2}}{n}}$	-	0
Mean Absolute Error (mm)	$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_{i} - x_{i}|$	-	0
Kling–Gupta Efficiency	$\mathrm{KGE} = 1 - \sqrt{(r - 1)^{2} + \left(\frac{\sigma_{\mathrm{sim}}}{\sigma_{\mathrm{obs}}} - 1\right)^{2} + \left(\frac{\mu_{\mathrm{sim}}}{\mu_{\mathrm{obs}}} - 1\right)^{2}}$	−∞ to 1	1
Probability of Detection	$\mathrm{POD} = \frac{A}{A + C}$	0 to 1	1
False Alarm Ratio	$\mathrm{FAR} = \frac{B}{A + B}$	0 to 1	0
Critical Success Index	$\mathrm{CSI} = \frac{A}{A + B + C}$	0 to 1	1
Rx1 day	Maximum 1-day precipitation over a given period	mm
R10 mm	Yearly count of days with rainfall ≥ 10 mm	days
R20 mm	Yearly count of days with rainfall ≥ 20 mm	days
R30 mm	Yearly count of days with rainfall ≥ 30 mm	days
CWD	Maximum length of wet spell: maximum number of consecutive days with rainfall ≥ 1 mm	days
CDD	Maximum length of dry spell: maximum number of consecutive days with rainfall < 1 mm	days
PRCPTOT	Annual total precipitation on wet days	mm
R95P	Annual total precipitation when rainfall > 95th percentile	mm

Where A (Hits): Number of events correctly detected by the QPE (rain observed and rain predicted). B (False Alarms): Number of events predicted as rain by the QPE but not observed by gauges. C (Misses): Number of rainfall events observed by gauges but not detected by the QPE.
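The statistical and categorical measures in Table 2 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: it assumes NumPy, paired daily gauge (`obs`) and product (`sim`) series in mm, and a 1 mm rain/no-rain threshold (the threshold used in the study is not stated here). FAR is computed with false alarms (B) in the numerator, following the standard contingency-table definition and the definitions of A, B, and C given above. FBI does not appear in Table 2; the standard frequency-bias form (A + B)/(A + C) is assumed. All function names are hypothetical.

```python
import numpy as np


def rmse(obs, sim):
    """Root mean square error (mm)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def mae(obs, sim):
    """Mean absolute error (mm)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.mean(np.abs(sim - obs)))


def kge(obs, sim):
    """Kling-Gupta Efficiency; undefined if obs is constant or has zero mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]    # linear correlation
    alpha = sim.std() / obs.std()      # variability ratio
    beta = sim.mean() / obs.mean()     # bias ratio
    return float(1 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))


def _ratio(num, den):
    """Safe division: NaN when the denominator is zero (e.g. no rain events)."""
    return float(num) / float(den) if den > 0 else float("nan")


def categorical_scores(obs, sim, threshold=1.0):
    """POD, FAR, CSI, and FBI from a rain/no-rain contingency table."""
    obs_wet = np.asarray(obs, float) >= threshold
    sim_wet = np.asarray(sim, float) >= threshold
    a = np.sum(obs_wet & sim_wet)      # A: hits
    b = np.sum(~obs_wet & sim_wet)     # B: false alarms
    c = np.sum(obs_wet & ~sim_wet)     # C: misses
    return {
        "POD": _ratio(a, a + c),
        "FAR": _ratio(b, a + b),
        "CSI": _ratio(a, a + b + c),
        "FBI": _ratio(a + b, a + c),   # assumed standard definition
    }


if __name__ == "__main__":
    gauge = [0, 5, 0, 12, 0, 3]        # made-up daily totals (mm)
    product = [0, 4, 2, 6, 0, 0]
    print(rmse(gauge, product), mae(gauge, product), kge(gauge, product))
    print(categorical_scores(gauge, product))
```

Returning NaN for empty denominators is a design choice for arid stations, where a short evaluation window may contain no observed or estimated rain days.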

