How Do Different Precipitation Products Perform in a Dry-Climate Region?

Brobst-Whitcomb, Noelle; Maggioni, Viviana

doi:10.3390/atmos17010005

by Noelle Brobst-Whitcomb and Viviana Maggioni ^*

Department of Civil, Environmental & Infrastructure Engineering, George Mason University, Fairfax, VA 22030, USA

^* Author to whom correspondence should be addressed.

Atmosphere 2026, 17(1), 5; https://doi.org/10.3390/atmos17010005

Submission received: 6 November 2025 / Revised: 12 December 2025 / Accepted: 18 December 2025 / Published: 20 December 2025

(This article belongs to the Special Issue Compound Extreme Events in a Changing Climate: Atmospheric Mechanisms and Hydrological Consequences)

Abstract

Dry climate regions face heightened risks of flooding and infrastructure damage even with minimal rainfall. Climate change is intensifying this vulnerability by increasing the duration, frequency, and intensity of precipitation events in areas that have historically experienced arid conditions. As a result, accurate precipitation estimation in these regions is critical for effective planning, risk mitigation, and infrastructure resilience. This study evaluates the performance of five satellite- and model-based precipitation products by comparing them against in situ rain gauge observations in a dry-climate region: The fifth generation European Centre for Medium-Range Weather Forecasts Reanalysis (ERA5) (analyzing maximum and minimum precipitation rates separately), the Modern-Era Retrospective analysis for Research and Applications version 2 (MERRA2), the Western Land Data Assimilation System (WLDAS), and the Integrated Multi-satellite Retrievals for Global Precipitation Measurement (IMERG). The analysis focuses on both average daily rainfall and extreme precipitation events, with particular attention to precipitation magnitude and the accuracy of event detection, using a combination of statistical metrics—including bias ratio, mean error, and correlation coefficient—as well as contingency statistics such as probability of detection, false alarm rate, missed precipitation fraction, and false precipitation fraction. The study area is Palm Desert, a mountainous, arid, and urban region in Southern California, which exemplifies the challenges faced by dry regions under changing climate conditions. Among the products assessed, WLDAS ranked highest in measuring total precipitation and extreme rainfall amounts but performed the worst in detecting the occurrence of both average and extreme rainfall events. 
In contrast, IMERG and ERA5-MIN demonstrated the strongest ability to detect the timing of precipitation, though they were less accurate in estimating the magnitude of rainfall per event. Overall, this study provides valuable insights into the reliability and limitations of different precipitation estimation products in dry regions, where even small amounts of rainfall can have disproportionately large impacts on infrastructure and public safety.

Keywords: precipitation; dry climate; remote sensing; error analysis

1. Introduction

Evidence of climate change is present globally [1]. Increases in surface temperature are greatly affecting the hydrological cycle from the local to the regional and global scale, ultimately leading to increased intensity and frequency of precipitation [2]. This increases the risk of flooding, which is the most frequent type of natural disaster and can substantially damage the affected areas [3]. The best way to mitigate damages from flooding is through adaptive measures that increase the resilience of current infrastructure to extreme events [4]. Gaining a solid understanding of how different volumes of precipitation will affect different areas, together with a reliable method of predicting floods, is crucial to increase community resilience to extreme hydroclimatic events. Climate change has undermined the assumption of stationarity, so traditional prediction models may no longer be reliable or valid, and existing infrastructure may no longer be resilient to current and future storm events [1].

Some areas with typically arid climate, such as Central Asia, are getting larger volumes of rain more frequently [5]. Compared to more temperate areas, the amount of increased rainfall may seem low or insignificant. However, even a small increase in precipitation, especially if received during a limited amount of time, can have catastrophic effects on such regions [5,6]. As these areas receive increased precipitation or a higher frequency of extreme precipitation events, there are adverse effects on resource management, infrastructure, and livelihood. Such effects may include destruction (and/or disruption) of housing, roads, and resource management equipment. Other consequences of these changing precipitation patterns include too much water during some parts of the year and too little during others, which can affect the availability of water year-round. In Central Asia, there is a limited rainy season that serves as the primary source of water for the region, so any changes to the volume and frequency of rainfall will significantly affect those who live there [5]. The region relies on a consistent wet season for its water supply, and if precipitation occurs outside of that time window, it causes runoff from snowy mountains to occur sooner than usual, which then shortens the wet season. Shortening the wet season ultimately shortens the growing and harvesting seasons for crops by decreasing the amount and availability of water, thereby limiting the crop yield in the region and impacting food availability for consumption or commerce [5].

Another arid area that is experiencing increased rainfall is Southern California in the United States. In August 2023, Hurricane Hilary devastated Southern California, which received between 102 mm and 153 mm of rain within three days. This may not seem significant compared to what is typically observed in other, temperate areas of the country, but it exceeded the daily and monthly records for the area and caused significant damage to infrastructure, including buildings, houses, and roads [7]. This area is not accustomed to receiving such large quantities of precipitation in such a short period of time, so its infrastructure is not built to withstand these localized extreme events. The problem will only get worse as these types of events become more common.

Since even a small difference in rainfall can have such a large impact in dry areas, it is all the more important to minimize errors in the precipitation estimates used in flood forecasting models in such regions. Accurate flood prediction models are fundamental for engineers and planners when building new infrastructure and planning management actions, and the most critical input to such models is precipitation. Furthermore, precipitation measurements are used for an array of applications, including reservoir operations, land development, prevention of extreme hydroclimatic events (e.g., floods, landslides), weather and climate forecasting, and disease control [8,9]. An accurate measure of precipitation is therefore crucial to use such products effectively in the applications listed above [10].

Precipitation is commonly measured by in situ gauges, weather radars, satellites, and re-analysis models. Ground-based instruments, including rain gauges and weather radars, are widely used for measuring precipitation [11]. Rain gauges provide high temporal frequency but are prone to errors from wind effects and evaporation [12]. Radar networks provide continuous coverage with high spatial and temporal resolution at regional scales. However, radar-based measurements are affected by errors due to various issues such as surface backscatter contamination, attenuation of the signal, and uncertainty of the reflectivity–rain-rate relationship [13,14,15].

Continuous and near-real-time coverage of the Earth can only be recorded with satellite precipitation sensors. The most accurate satellite precipitation estimates are from a combination of infrared (IR) sensors on geostationary satellites, characterized by high sampling frequency, and passive microwave (PMW) sensors on low-Earth-orbiting satellites with less-frequent sampling [16]. Unlike PMW sensors that collect data of emissions and scattering signals of raindrops, snow, and ice contents, IR data measure cloud-top temperatures and cloud heights [17].

Past efforts have evaluated and utilized satellite-based observations in a suite of hydrologic applications [18,19,20,21,22]. A few focused on dry climate areas. For example, Morin et al. (2020) analyzed precipitation climatology from satellite observations in dry regions of the world and concluded that these areas are characterized not only by lower annual precipitation and higher variability, but also by fewer rainy days, a more pronounced extreme tail in the precipitation distribution, a smaller proportion of the area experiencing rainfall, and shorter spatial correlation distances [23]. The study by Serrat-Capdevila et al. (2016) assessed three satellite precipitation products over Africa and found that their performance in dry regions was generally weaker due to infrequent and localized rainfall [24]. However, after applying a bias correction, their accuracy improved significantly. Nozarpour et al. (2024) performed an assessment of satellite precipitation estimation products over Iran, i.e., Integrated Multi-satellite Retrievals for GPM (IMERG-V6), Multi-Source Weighted-Ensemble Precipitation (MSWEP), Tropical Rainfall Measuring Mission (TRMM) Multi-Satellite Precipitation Analysis (TMPA-3B43V7), and Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks—Climate Data Record (PERSIANN-CDR) [25]. Their study found that all products consistently had fewer errors in regions of Iran with lower precipitation rates. Another study validated remote sensing precipitation products in southern Spain by comparing them to measurements from ground stations [26]. They also developed a methodology to identify extreme rainfall and drought events over the past 30 years using satellite-derived data. Furthermore, Vernimmen et al. (2012) found that satellite rainfall products underestimate dry season rainfall in Indonesia, with TMPA-3B42 (near real time version) performing better than others [27]. 
Another recent work that evaluated three high-resolution satellite-based precipitation products over different Chinese basins showed that, while all perform well for monthly precipitation, their accuracy declines in dry/arid regions, especially when estimating extreme precipitation [28].

Reanalysis precipitation products are obtained by combining observational data, satellite measurements, and numerical weather prediction models, which are then processed to create a continuous and consistent time series. Reanalysis products typically cover the entire globe or large regional areas and span several decades, often from the mid-20th century to the present, making them particularly valuable for understanding long-term trends, variability, and extremes in precipitation patterns [18,29,30]. Past work has shown that the performance of such products varies by region and precipitation intensity: they often struggle to detect daily precipitation events in arid zones, underestimate moderate-to-heavy rainfall, and overestimate light rainfall [31,32]. For instance, a study using re-analysis data during 1979–2018 showed that global drylands experienced a significant overall decrease in precipitation, though some regions (southern Africa, Australia, northern Africa, and South Asia) saw increases in summer rainfall [33]. Lavers et al. [34] evaluated re-analysis precipitation (ERA5) data against 5637 weather stations worldwide (2001–2020) and found that while ERA5 captures broad spatial patterns and monthly variability well in extratropical regions, it exhibits significant wet bias, low correlations, and large errors in tropical and dryland areas. It also underestimates extreme daily rainfall totals, meaning that precipitation trends, dry spells, and drought frequency in arid regions may be inaccurately represented [34].

This study investigates the performance of a suite of precipitation products from both satellite retrievals and models in a dry-climate region, where rain events that are more intense than usual may cause significant damage. Specifically, this study analyzes four datasets (one satellite and three re-analysis products) and compares them to ground-recorded observations to determine which sources are more accurate and where improvements should be directed. Palm Desert in Southern California, a historically dry climate region, is chosen as the study area, and the study period runs from 2000 through 2019. The study answers the following overall research question: What is the performance of different precipitation products in a dry-climate region? More specifically, what is their ability to estimate (1) the magnitude of average precipitation; (2) the magnitude of extreme precipitation events; (3) the occurrence of precipitation overall; and (4) the occurrence of extreme events? The methodological framework used to answer these four questions is presented in Section 2 and includes a description of the study area, the five datasets adopted in this work, and the statistical analysis. Results are illustrated and discussed in Section 3, whereas conclusions are drawn in Section 4.

2. Materials and Methods

2.1. Study Area

This study focuses on the area surrounding Palm Desert, California, United States, specifically within the following geographical coordinates: 117.20° W; 115.94° W and 33.25° N; 34.25° N (Figure 1). The climate in this area is historically dry, with warm winters and hot summers, receiving an average of less than 400 mm/year of precipitation [28]. The study area is primarily urban, surrounded by mountains and desert. The average high temperature in the warmest month in this area ranges from 68.1 °F to 84.2 °F, and the lowest temperature in the coolest month ranges from 39.2 °F to 58 °F [35].

In addition to its arid conditions, this region is characterized by complex terrain (as shown in Figure 2), which is well known to complicate both the observation and modeling of precipitation. Such topographic influences can introduce substantial uncertainty by affecting local atmospheric processes, sensor performance, and model representation of spatial variability. As discussed further in Section 4, these factors may be contributing to the discrepancies and limitations observed in the results presented here.

This area became of particular interest due to Hurricane Hilary, which devastated the region in August 2023. There were wind gusts up to 46 miles per hour, and a total rainfall estimate of 102–153 mm within three days [36,37]. The storm exceeded daily and monthly records for the area, and resulted in significant damage to roads, bridges, and infrastructure, and power loss [7].

2.2. Datasets

This study focuses on an almost 20-year-long time series, from 5 June 2000 through 31 December 2019. The ground data were collected from the National Oceanic and Atmospheric Administration (NOAA) Physical Sciences Laboratory “CPC Global Unified Gauge-Based Analysis of Daily Precipitation” product [38,39]. This dataset consists of daily precipitation values obtained from a network of gauges and uses the optimal interpolation objective analysis technique. The spatial resolution is 0.5 degrees latitude by 0.5 degrees longitude (Figure 2) and the temporal resolution is daily.

Four datasets were analyzed against the reference ground data and are described next: ERA5, MERRA2, WLDAS, and IMERG (Figure 2). A summary of each product’s characteristics is provided in Table 1. WLDAS is available at the finest spatial resolution, whereas ERA5 (both -MAX and -MIN) and MERRA2 have the highest temporal resolution. ERA5 is the atmospheric reanalysis of global climate data produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) and the Copernicus Climate Change Service (C3S). The MERRA2, WLDAS, and IMERG datasets were obtained from the NASA Goddard Earth Sciences (GES) Data and Information Services Center (DISC) database.

The fifth generation ECMWF Reanalysis (ERA5) produces data using 4D-Variational data assimilation and modeling in the ECMWF Integrated Forecast System (IFS) [40]. This includes data at 137 model pressure levels, which are all interpolated to different pressure, temperature, and vorticity levels. Two variables were analyzed from this dataset: maximum total precipitation rate (MXTPR) and minimum total precipitation rate (MNTPR). For reference in this study, the MXTPR dataset will be noted as ERA5-MAX, and the MNTPR dataset will be noted as ERA5-MIN. The temporal range is 1 January 1940 to present.

The Modern-Era Retrospective analysis for Research and Applications, version 2, (MERRA2) uses a combination of observations, microwave sounders, and hyperspectral infrared radiance instruments to provide hourly data [41]. It provides a reanalysis of precipitation data collected from Goddard Earth Observing System Model, Version 5 (GEOS-5) and a data assimilation system. All fields are computed on a cubed grid, and the precipitation is then either provided on all 72 model layers or interpolated to 42 pressure levels. This dataset is available from 9 January 1980 to present. Figure 2b shows the four MERRA2 pixels that cover the study area.

The Western Land Data Assimilation System (WLDAS) utilizes the NASA Land Information System (LIS) and meteorological observations to simulate precipitation. This product is tailored to the Western United States and produces daily high-resolution data [42]. The available time series is 6 January 1979 through 31 December 2023.

The Integrated Multi-satellitE Retrievals for GPM (IMERG) dataset uses the GPM (Global Precipitation Measurement) Core Observatory satellite to combine infrared and microwave sensor readings and precipitation observations from the Tropical Rainfall Measuring Mission (TRMM) and Global Precipitation Measurement (GPM) satellite missions [43]. IMERG is available from 5 June 2000 to present at half-hourly/0.1° temporal/spatial resolution. The IMERG Final V07 product, used in this work, is a research-grade dataset that incorporates gauge corrections for improved accuracy [16]. While it captures broad spatial patterns and temporal variability well, performance varies by region and surface type, with reduced skill over frozen areas, complex terrain, and some tropical or dryland regions.

2.3. Data Analysis

The first set of analyses to assess the performance of the four precipitation products in Palm Desert is based on scatterplots and cumulative distribution functions (CDFs).

The percentage of days with no precipitation (dry days) and days with precipitation (wet days) was then investigated for each product. The threshold for wet days included any precipitation detected over 0.1 mm/day.
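The wet/dry classification described above can be sketched in a few lines of Python. This is only an illustration: the function name `wet_day_fraction` and the sample series are hypothetical, and only the 0.1 mm/day threshold is taken from the text.

```python
WET_DAY_THRESHOLD_MM = 0.1  # mm/day, the wet-day threshold stated in the text


def wet_day_fraction(daily_precip_mm, threshold=WET_DAY_THRESHOLD_MM):
    """Return the fraction of days with precipitation strictly above the threshold."""
    if not daily_precip_mm:
        raise ValueError("the daily series is empty")
    wet_days = sum(1 for p in daily_precip_mm if p > threshold)
    return wet_days / len(daily_precip_mm)


# Hypothetical 10-day series in mm/day, for illustration only.
sample = [0.0, 0.0, 0.05, 2.3, 0.0, 0.1, 11.8, 0.0, 0.0, 0.4]
print(f"wet-day fraction: {wet_day_fraction(sample):.2f}")  # 3 of 10 days exceed 0.1
```

Note that a day with exactly 0.1 mm/day counts as dry here, because the text describes wet days as precipitation "over" the threshold.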

Next, a suite of common statistical metrics was used to further investigate the products’ performance: bias ratio, mean error, root mean square error, and Pearson’s correlation coefficient [44]. The bias ratio compares each evaluation dataset to the reference dataset, as shown in Equation (1). Ideally, there would not be any bias in the datasets, meaning the bias ratio would be one.

\mathrm{Bias\ Ratio} = \frac{\frac{\sum_{i = 1}^{n} {\hat{x}}_{i}}{n}}{\frac{\sum_{i = 1}^{n} x_{i}}{n}}

(1)

where \hat{x}_{i} is the estimated precipitation value, x_{i} is the reference precipitation value, and n is the total number of data points.

The mean error measures the average difference between two datasets. The closer to zero this metric is, the smaller the difference between the two, as shown in Equation (2).

\mathrm{Mean\ Error} = \frac{\sum_{i = 1}^{n} {\hat{x}}_{i}}{n} - \frac{\sum_{i = 1}^{n} x_{i}}{n}

(2)
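Equations (1) and (2) can be illustrated with a short Python sketch. The function names and the four-value sample series below are hypothetical and chosen only so the arithmetic is easy to follow; they are not data from this study.

```python
def _check_pairs(estimates, reference):
    """Validate that the two series are non-empty and paired one-to-one."""
    if len(estimates) != len(reference) or not reference:
        raise ValueError("series must be non-empty and of equal length")


def bias_ratio(estimates, reference):
    """Equation (1): mean of the estimates divided by the mean of the reference."""
    _check_pairs(estimates, reference)
    n = len(reference)
    reference_mean = sum(reference) / n
    if reference_mean == 0:
        raise ZeroDivisionError("reference mean is zero; the ratio is undefined")
    return (sum(estimates) / n) / reference_mean


def mean_error(estimates, reference):
    """Equation (2): mean of the estimates minus the mean of the reference."""
    _check_pairs(estimates, reference)
    n = len(reference)
    return sum(estimates) / n - sum(reference) / n


# Hypothetical paired daily values in mm/day.
reference = [0.0, 2.0, 4.0, 10.0]   # mean 4.0
estimates = [1.0, 1.0, 6.0, 12.0]   # mean 5.0
print(bias_ratio(estimates, reference))  # 1.25, i.e., a 25% overestimate on average
print(mean_error(estimates, reference))  # 1.0 mm/day
```

A bias ratio above one and a positive mean error both indicate overestimation; the ratio is unitless, whereas the mean error keeps the units of the data.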

The root mean square error (RMSE) is a measure of how far the estimated precipitation is from the reference (in situ) values. RMSE captures the overall magnitude of errors by emphasizing large deviations, something that mean error and bias ratio cannot do. It is calculated as the square root of the average of the squared differences between them:

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(x_{i} - {\hat{x}}_{i})}^{2}}

(3)
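The following Python sketch of Equation (3) shows why RMSE reveals errors that the mean error can hide. The function name `rmse` and both sample estimate series are hypothetical; they are constructed so that the mean error is zero in both cases while the RMSE differs.

```python
import math


def rmse(estimates, reference):
    """Equation (3): square root of the mean squared difference."""
    if len(estimates) != len(reference) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(x - x_hat) ** 2 for x_hat, x in zip(estimates, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical values in mm/day. Both estimate series have zero mean error.
reference = [2.0, 2.0, 2.0, 2.0]
small_errors = [1.0, 3.0, 1.0, 3.0]   # four errors of 1 mm/day
large_errors = [2.0, 2.0, 0.0, 4.0]   # two errors of 2 mm/day, two exact matches

print(rmse(small_errors, reference))  # 1.0
print(rmse(large_errors, reference))  # about 1.414, larger despite two exact matches
```

Because the differences are squared before averaging, a few large deviations weigh more than many small ones, which is the property the text refers to.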

Pearson’s correlation coefficient measures the linear association between two datasets. Ideally, the datasets would have a perfect positive linear relationship, which corresponds to a value of 1 for this metric:

\mathrm{Corr} = \frac{n \sum_{i = 1}^{n} {\hat{x}}_{i} x_{i} - \sum_{i = 1}^{n} {\hat{x}}_{i} \sum_{i = 1}^{n} x_{i}}{\sqrt{[n \sum_{i = 1}^{n} {\hat{x}}_{i}^{2} - {(\sum_{i = 1}^{n} {\hat{x}}_{i})}^{2}] [n \sum_{i = 1}^{n} {x_{i}}^{2} - {(\sum_{i = 1}^{n} x_{i})}^{2}]}}

(4)
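Equation (4) is the computational (sums-based) form of Pearson's coefficient, and it can be written almost term by term in Python. The function name `pearson_corr` and the sample values are hypothetical; raising an error for a constant series is a design choice of this sketch, not something the text specifies.

```python
import math


def pearson_corr(estimates, reference):
    """Equation (4): Pearson correlation from running sums."""
    if len(estimates) != len(reference) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    n = len(reference)
    sum_est = sum(estimates)
    sum_ref = sum(reference)
    sum_cross = sum(e * r for e, r in zip(estimates, reference))
    sum_est_sq = sum(e * e for e in estimates)
    sum_ref_sq = sum(r * r for r in reference)

    numerator = n * sum_cross - sum_est * sum_ref
    denominator = math.sqrt(
        (n * sum_est_sq - sum_est ** 2) * (n * sum_ref_sq - sum_ref ** 2)
    )
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant series")
    return numerator / denominator


# Hypothetical values: the estimate is exactly twice the reference.
reference = [1.0, 2.0, 3.0, 4.0]
doubled = [2.0, 4.0, 6.0, 8.0]
print(pearson_corr(doubled, reference))  # 1.0
```

The doubled series correlates perfectly with the reference even though its bias ratio is 2, which is why correlation is read alongside the bias metrics rather than in place of them.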

Contingency tables were then created to calculate the number of hit cases (H), missed events (M), false alarms (F), and correct no precipitation presence (Z). H represents the number of times the estimate correctly detected the presence of precipitation [44]. M refers to times in which precipitation was not detected, but the reference did observe rain. F represents the number of times the estimate did detect precipitation when in fact it had not rained. Z refers to the number of times both estimate and reference detected no precipitation.
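The four contingency counts defined above can be tallied from paired daily values as in the Python sketch below. The function name `contingency_counts` and the sample series are hypothetical, and applying the 0.1 mm/day wet-day threshold to both series is an assumption made for illustration.

```python
def contingency_counts(estimates, reference, threshold=0.1):
    """Return (H, M, F, Z) for paired daily values and a rain/no-rain threshold."""
    if len(estimates) != len(reference):
        raise ValueError("series must be of equal length")
    h = m = f = z = 0
    for est, ref in zip(estimates, reference):
        est_wet = est > threshold
        ref_wet = ref > threshold
        if est_wet and ref_wet:
            h += 1   # hit: both detect precipitation
        elif ref_wet:
            m += 1   # miss: only the reference detects precipitation
        elif est_wet:
            f += 1   # false alarm: only the estimate detects precipitation
        else:
            z += 1   # correct no-precipitation case
    return h, m, f, z


# Hypothetical six-day example in mm/day.
reference = [0.0, 5.0, 0.0, 2.0, 0.0, 0.0]
estimates = [0.0, 3.0, 1.0, 0.0, 0.0, 0.2]
print(contingency_counts(estimates, reference))  # (1, 1, 2, 2)
```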

This study used contingency tables to calculate error statistics relative to a precipitation threshold. The probability of detection (POD) measures the likelihood that a product correctly detects the presence of precipitation by comparing the number of times the evaluation dataset correctly detected precipitation (hits) with the total number of times the reference recorded precipitation (hits plus missed events):

\mathrm{POD} = \frac{H}{(H + M)}

(5)

POD can be computed as a function of a threshold defined based on reference precipitation, as shown in Table 2 [45].

The missed precipitation represents the ratio of the volume of precipitation not captured by the estimation product to the total volume of precipitation captured by the reference product, with respect to the reference rain threshold [45].

Missed precipitation is also computed based on the contingency values defined in Table 2, where the number of missed events is divided by the sum of the number of missed events and hit cases:

\mathrm{Missed\ Precipitation\ Ratio} = \frac{M}{(H + M)}

(6)

The false alarm rate (FAR) quantifies how often the evaluation dataset incorrectly detected the presence of precipitation by comparing the number of false alarms with the total number of times the reference recorded no precipitation (false alarms plus correct no-precipitation cases):

\mathrm{FAR} = \frac{F}{(F + Z)}

(7)

FAR, or the likelihood of an evaluation product incorrectly detecting precipitation when in fact it does not rain, can be calculated as a function of threshold defined based on the evaluation product, as presented in Table 3 [45].

The falsely detected precipitation represents the ratio of the volume of precipitation incorrectly captured by the estimation product to the total volume of precipitation detected by the estimation product, with respect to the reference rain threshold.

Falsely detected precipitation is computed as a function of the FAR contingency values, where the number of false detections is divided by the sum of the number of false detections and hit cases:

\mathrm{Falsely\ Detected\ Precipitation\ Ratio} = \frac{F}{(F + H)}

(8)
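Equations (5) through (8) all derive from the same four counts, so they can be computed together. The Python sketch below implements each ratio exactly as printed; the function name `detection_ratios`, the dictionary keys, and the counts are hypothetical and do not come from the study's tables.

```python
def detection_ratios(h, m, f, z):
    """Compute Equations (5)-(8) from hits, misses, false alarms, and correct negatives."""
    if h + m == 0 or f + z == 0 or f + h == 0:
        raise ValueError("a denominator is zero; the ratios are undefined")
    return {
        "POD": h / (h + m),            # Equation (5)
        "missed_ratio": m / (h + m),   # Equation (6)
        "FAR": f / (f + z),            # Equation (7), as printed
        "false_ratio": f / (f + h),    # Equation (8)
    }


# Hypothetical counts for a 200-day record.
ratios = detection_ratios(h=40, m=10, f=20, z=130)
for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
```

With these counts POD is 0.8 and the missed ratio is 0.2. The two always sum to one because they share the denominator H + M, so reporting both is a matter of emphasis rather than independent information.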

An issue with using POD and FAR is that they are inherently interdependent, meaning that improving one often comes at the expense of the other. Increasing POD typically requires flagging more events, which can inadvertently raise the number of false alarms. Conversely, a more conservative detection approach may leave some actual events undetected. This trade-off makes it challenging to assess overall skill using these metrics in isolation, as a product could appear skillful in one metric while performing poorly in the other. Metrics like the Equitable Threat Score (ETS) and the True Skill Statistic (TSS) help address this limitation by combining information on hits, misses, and false alarms into a single score [46]. ETS is commonly used for operational evaluation of forecast performance across a range of event thresholds. A perfect forecast receives a score of one, while a forecast no better than random chance scores zero. The metric is derived by adjusting the traditional threat score to account for the score expected from random forecasts. However, despite its equitability and widespread adoption, ETS artificially declines when the event of interest becomes rare, giving the misleading impression of reduced forecast skill [47]. It is computed as follows:

\mathrm{ETS} = \frac{H - e}{H + F + M - e}

(9)

e = \frac{(H + F) (H + M)}{H + F + M + Z}

(10)

The TSS, or Hanssen–Kuipers skill score, is another standard metric for assessing precipitation forecasts (e.g., [48]). TSS is often a more reliable indicator than ETS because it scales with the frequency of the event being predicted and assigns equal importance to correctly forecasting both occurrences and non-occurrences [49]:

\mathrm{TSS} = \frac{H Z - F M}{(H + M) (F + Z)}

(11)

Both ETS and TSS are computed for contingency values relative to reference precipitation, defined in Table 2.
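As a rough illustration of Equations (9) through (11), the Python sketch below computes both scores from a set of contingency counts. The function names `ets` and `tss` and the counts are hypothetical and chosen only to make the arithmetic easy to check by hand.

```python
def ets(h, m, f, z):
    """Equations (9) and (10): threat score adjusted for hits expected by chance."""
    total = h + f + m + z
    if total == 0:
        raise ValueError("the contingency table is empty")
    e = (h + f) * (h + m) / total      # Equation (10): hits expected by chance
    denominator = h + f + m - e
    if denominator == 0:
        raise ValueError("ETS is undefined for these counts")
    return (h - e) / denominator


def tss(h, m, f, z):
    """Equation (11): True Skill Statistic (Hanssen-Kuipers skill score)."""
    if h + m == 0 or f + z == 0:
        raise ValueError("TSS is undefined when a marginal total is zero")
    return (h * z - f * m) / ((h + m) * (f + z))


# Hypothetical counts for a 200-day record.
h, m, f, z = 40, 10, 20, 130
print(f"ETS = {ets(h, m, f, z):.3f}")  # e = 15, so ETS = 25 / 55
print(f"TSS = {tss(h, m, f, z):.3f}")  # 5000 / 7500
```

Expanding Equation (11) shows that TSS equals H/(H + M) minus F/(F + Z), that is, POD from Equation (5) minus FAR from Equation (7). For the counts above this gives 0.8 minus 0.133, which matches the direct calculation.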

3. Results

The time series of precipitation events recorded by each dataset during the study period (from 5 June 2000 to 31 December 2019) are illustrated in Figure 3. The peaks observed in the time series are clearly identifiable across all datasets. These events exhibit strong temporal alignment, indicating consistent detection of rainfall occurrences across the observational and estimation products. However, discrepancies in peak magnitudes are evident, reflecting systematic biases wherein certain estimation products either overestimate or underestimate the actual precipitation intensities associated with these events.

A more detailed investigation of these discrepancies is conducted through quantitative analyses aimed at characterizing the deviations in reported precipitation across datasets. These analyses facilitate a rigorous evaluation of each product’s performance, enabling identification of the temporal and contextual conditions under which discrepancies are most pronounced.

An initial step in diagnosing the discrepancies among estimation products involved isolating two large precipitation events within the time series and studying their evolution in time. Figure 4 shows time series for two specific events: 29 December 2004 and 14 February 2019. In the first event (Figure 4a), the rain gauges reported the rainfall event starting on 28 December 2004, continuing to the next day at a lower rate, and concluding by 30 December 2004. IMERG, MERRA2, and WLDAS capture the event timing, although WLDAS underestimates the amount of rainfall on both days of the event. ERA5-MAX and ERA5-MIN show a delay in the event detection, with the peak occurring on the second day rather than the first. ERA5-MAX overestimates the precipitation rate, whereas ERA5-MIN appears to report values much closer to the reference rainfall. In summary, while IMERG, MERRA2, and WLDAS are better at estimating the timing of this event, IMERG, MERRA2, and ERA5-MIN are better at estimating its peak magnitude. The second event (Figure 4b) seems to have perfect timing across all estimation products, with a peak on 14 February 2019. However, the magnitude of this peak varies across datasets. Specifically, ERA5-MIN and MERRA2 underestimated the peak magnitude, while ERA5-MAX, WLDAS, and IMERG overestimated it. Across the two events (Figure 4), IMERG and MERRA2 consistently reported the correct timing of precipitation, and ERA5-MIN and MERRA2 reported magnitudes of precipitation similar to the reference. This could indicate a good performance of IMERG relative to the other products at both estimating the timing and magnitude of precipitation events across the study area, although further investigation is required.

The scatterplots for each estimation product versus the reference dataset are presented in Figure 5. The linear relationship between each product and the reference product is positive, although not very strong, indicating there may be room for improvement in each product. Broad scatter in the data, often accompanied by low correlation values, is common when comparing ground-based precipitation observations—whether from gauges or weather radar—to satellite-derived or model-based precipitation estimates. This behavior has been widely documented in previous studies, reflecting differences in spatial resolution, sampling strategies, and measurement uncertainties across observing platforms (e.g., [50,51,52,53]).

Six percentiles were computed for each product for precipitation rates greater than 0.1 mm/day (Table 4). Percentiles of each estimation product were generally similar to those of the reference dataset. ERA5-MAX values are higher than the reference ones, which is expected given that this dataset provides maximum daily precipitation. WLDAS was consistently the closest in value at each percentile, with values that almost matched those of the reference dataset. The CDF plots in Figure 6 provide a visualization of these percentile values and how they compare to one another.

In presenting and discussing our results, we define extreme precipitation as values exceeding the 90th percentile, with particular attention given to more intense events above the 95th and 99th percentiles. The CDF of the reference dataset illustrates that 90% of the daily precipitation values recorded were 7.17 mm/day or less. IMERG reported only 4.27 mm/day at the 90th percentile. However, it was almost the same value as the reference dataset in the 99th percentile with 27.41 mm/day or less reported (Table 4). This indicates that the average amount of precipitation being reported by IMERG is less than the actual occurrence, but the amount of precipitation being reported for extreme events is more likely to be accurate. Something similar could be stated for ERA5-MIN and MERRA2, although these percentile values are much closer to the reference data at the 90th percentile, and ERA5-MIN is much closer to the reference dataset value at the 99th percentile than MERRA2. WLDAS is very close to the reference at both the 90th and 99th percentiles, indicating that this estimation product may be accurately reporting precipitation values during both typical and extreme events. ERA5-MAX produces larger estimates of precipitation at all percentiles, which is expected given the nature of this product to estimate higher rainfall.

The overall ratio of “wet” days, defined as any day that received more than 0.1 mm of rain in one day, is analyzed in Figure 7.

Across all products, the proportion of wet days remained within the 10% to 40% range of the total number of days, with the reference showing a ratio of 19.5% wet days over the entire study period (number of days that recorded a value larger than 0.1 mm/day divided by the total number of days in the time series). ERA5-MAX reported 37.0% of the total time series as wet days, ERA5-MIN reported 17.3%, MERRA2 reported 26.2%, WLDAS reported 17.7%, and IMERG reported 31.8%. Among these, ERA5-MIN and WLDAS exhibited the highest concordance with the reference dataset, with the closest wet day percentages. The remaining products were still close, all less than 20 percentage points higher. However, given the high sensitivity of the study region to precipitation, even minor deviations in the frequency or detection of wet days among the products may carry significant implications.

3.1. Continuous Statistics

First, the overall bias ratio was computed for each estimated dataset with respect to the reference (Figure 8a). Then, a relative bias ratio was computed at three different thresholds, i.e., estimation rain rate higher than the 75th, 90th, and 95th percentiles (refer to Table 5 for percentile values). The bias ratio exhibits a consistent trend across all datasets, deteriorating progressively as the threshold increases. Among the datasets analyzed, ERA5-MAX exhibits the highest bias (greatest deviation from unity), whereas WLDAS and ERA5-MIN show the most favorable bias ratios. MERRA2 and IMERG present similar biases with a nearly linear increase with increasing threshold. They are similar in value to the biases of WLDAS and ERA5-MIN, although they deviate further from unity as the threshold increases. Insights gained could potentially inform enhancements to bias correction strategies that are rain rate dependent.
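A minimal sketch of a threshold-conditioned bias ratio is given below. It assumes the bias ratio is the sum of estimated precipitation divided by the sum of reference precipitation, taken over days on which the estimate exceeds the threshold. That definition, the function name, and the sample values are assumptions for illustration and may differ from the formulation used in the methodology.

```python
def bias_ratio(est, ref, threshold=0.0):
    """Sum of estimates over sum of reference, for days with est > threshold."""
    if len(est) != len(ref):
        raise ValueError("est and ref must have the same length")
    pairs = [(e, r) for e, r in zip(est, ref) if e > threshold]
    total_ref = sum(r for _, r in pairs)
    if total_ref == 0:
        raise ValueError("reference total is zero for the selected days")
    return sum(e for e, _ in pairs) / total_ref


# Hypothetical paired daily values (mm/day), not data from the study.
est_mm = [0.0, 2.0, 5.0, 12.0, 30.0]
ref_mm = [0.0, 1.0, 4.0, 6.0, 10.0]
overall = bias_ratio(est_mm, ref_mm)               # all days with est > 0
conditional = bias_ratio(est_mm, ref_mm, 10.0)     # only days with est > 10 mm/day
print(round(overall, 3), conditional)
```

In this toy series the ratio moves further from unity once only the heaviest estimated days are retained, which is the kind of rain-rate dependence a bias correction would need to account for.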

The behavior of the mean error closely mirrors that of the bias ratio, reflecting their inherent similarity (Figure 8b). However, unlike the bias ratio, the mean error quantifies the magnitude of rainfall misestimation, providing a more direct measure of the error in precipitation amounts. The mean error (for all non-zero values) remains near zero across all datasets except for ERA5-MAX, which, once again, is expected given the nature of this dataset. However, as the percentile threshold increases, the ability of the products to capture rainfall magnitudes observed by the gauges declines notably. For extreme precipitation events (e.g., the 95th percentile), ERA5-MAX exhibits mean errors reaching up to 20 mm/day. MERRA2 and IMERG show errors between 7 mm/day and 14 mm/day for the higher thresholds (90th and 95th percentiles). These error magnitudes are substantial, particularly in the context of the arid climate region examined in this study.

While the bias ratio and mean error discussed above mainly reveal whether each product tends to over- or under-estimate rainfall on average, they can hide large errors because overestimates and underestimates can cancel each other out. RMSE instead reflects both the size and variability of these errors, providing information on how accurately the products capture actual rainfall volumes, especially during heavy-rain events (Figure 8c). ERA5-MAX still exhibits the largest rainfall estimation errors, but the separation between ERA5-MAX and the IMERG and MERRA2 products is less pronounced in terms of RMSE than it is for bias ratio or mean error. RMSE indicates that ERA5-MAX’s performance, while still the poorest, is more comparable to the other products when considering total error magnitude rather than directional bias alone.
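The cancellation effect mentioned above can be seen in a small example. The sketch assumes the usual definitions (mean error as the average of estimate minus reference, RMSE as the square root of the mean squared difference); the function names and values are hypothetical.

```python
import math


def mean_error(est, ref):
    """Average signed difference (estimate minus reference)."""
    if len(est) != len(ref) or not est:
        raise ValueError("est and ref must be non-empty and of equal length")
    return sum(e - r for e, r in zip(est, ref)) / len(est)


def rmse(est, ref):
    """Root mean square of the differences between estimate and reference."""
    if len(est) != len(ref) or not est:
        raise ValueError("est and ref must be non-empty and of equal length")
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(est))


# Hypothetical values (mm/day): errors of +5 and -5 alternate.
ref_mm = [5.0, 5.0, 5.0, 5.0]
est_mm = [10.0, 0.0, 10.0, 0.0]
me = mean_error(est_mm, ref_mm)
err = rmse(est_mm, ref_mm)
print(me, err)
```

Here the mean error is zero even though every single day is off by 5 mm/day, while the RMSE reports the full 5 mm/day.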

Figure 9 presents Pearson’s correlation coefficient for each dataset. A general decline in correlation is observed as the percentile threshold increases. Notably, ERA5-MAX consistently exhibits the highest correlation values. This indicates that, despite the discrepancies in rainfall amounts discussed above, ERA5-MAX aligns most closely with the temporal pattern of rainfall observed in the reference dataset. In contrast, MERRA2 exhibits the weakest correlation with ground-based observations, which may be attributed to its relatively coarse spatial resolution.
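The sketch below illustrates why a product can correlate well with the reference while still misestimating amounts: Pearson's coefficient is unchanged when one series is multiplied by a positive constant. The function name and the sample series are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Hypothetical reference series (mm/day) and an estimate that triples every value.
ref_mm = [0.0, 1.0, 0.0, 8.0, 2.0, 0.0]
est_mm = [3.0 * r for r in ref_mm]
r_value = pearson_r(ref_mm, est_mm)
volume_ratio = sum(est_mm) / sum(ref_mm)
print(round(r_value, 6), volume_ratio)
```

The estimate reports three times the reference volume, yet the correlation is essentially 1 because the day-to-day pattern is identical.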

3.2. Contingency Metrics

Figure 10a illustrates the probability of detection relative to the estimated precipitation threshold, computed based on Table 2. ERA5-MAX exhibits the highest POD, with a probability of detection of 95% or higher when the threshold is 2 mm/day or greater, indicating strong performance in identifying rainfall events. While MERRA2 records a relatively low POD of 70% at the minimal threshold of 0.1 mm/day, it exhibits improved detection capabilities at higher thresholds, supporting its effectiveness in capturing extreme precipitation events in arid regions. In contrast, WLDAS consistently yields lower POD values across all thresholds. Despite its ability to closely replicate the overall distribution of rainfall, WLDAS appears limited in accurately detecting the timing of rainfall events, suggesting a deficiency in temporal precision.

Figure 10b illustrates false alarm ratios calculated based on the contingency matrix shown in Table 3 as a function of different reference precipitation thresholds. All datasets exhibit a similar decreasing trend in FAR, with ERA5-MIN consistently achieving the lowest values. Conversely, ERA5-MAX shows the highest FAR, indicating a greater tendency to report rainfall when none occurred. These results are expected given that the two products offer a minimum and maximum rainfall estimate during the day.

This is particularly important for the reliability of early warning systems, which depend heavily on accurate rainfall detection to issue timely alerts for potential flooding or other hydrometeorological hazards. A low FAR minimizes the risk of false alarms, which can erode public trust and lead to reduced responsiveness over time. Thus, the consistently low FAR across these datasets enhances their suitability for operational use in early warning and disaster preparedness frameworks, particularly in regions where rainfall is infrequent but can have significant impacts.
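For readers less familiar with contingency statistics, the probability of detection and false alarm ratio discussed above can be sketched as follows. The sketch assumes a single threshold applied to both the estimate and the reference, with POD = hits / (hits + misses) and FAR = false alarms / (hits + false alarms). The function names and sample series are hypothetical, and the exact thresholds used for Figure 10 may be applied differently.

```python
def contingency_counts(est, ref, threshold=0.1):
    """Count hits, misses, false alarms, and correct negatives for one threshold."""
    if len(est) != len(ref):
        raise ValueError("est and ref must have the same length")
    hits = misses = false_alarms = correct_negatives = 0
    for e, r in zip(est, ref):
        est_wet, ref_wet = e > threshold, r > threshold
        if est_wet and ref_wet:
            hits += 1
        elif ref_wet:
            misses += 1
        elif est_wet:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def pod(hits, misses):
    """Probability of detection: share of reference events that were detected."""
    return hits / (hits + misses)


def far(hits, false_alarms):
    """False alarm ratio: share of estimated events with no reference event."""
    return false_alarms / (hits + false_alarms)


# Hypothetical paired daily values (mm/day), not data from the study.
ref_mm = [0.0, 0.0, 1.0, 5.0, 0.0, 0.0, 2.0, 0.0]
est_mm = [0.0, 0.5, 2.0, 4.0, 0.0, 0.0, 0.0, 0.0]
h, m, f, c = contingency_counts(est_mm, ref_mm)
print(h, m, f, c, round(pod(h, m), 3), round(far(h, f), 3))
```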

The analysis of missed precipitation presented in Figure 11a reveals that all estimation products exhibit a similar decreasing trend as the threshold increases, as expected. WLDAS consistently shows a higher proportion of missed precipitation compared to the other datasets, which are much closer to one another. While previous findings indicated that WLDAS was among the most precise in replicating the overall rainfall distribution, it was also noted that its temporal accuracy (specifically, the correct timing of rainfall events) was likely the poorest. The elevated missed precipitation ratios observed for WLDAS in Figure 11a corroborate this finding, indicating a consistent underdetection of rainfall events.

Missed precipitation can be particularly problematic in arid and semi-arid regions, even at low rainfall intensities, due to the critical role that every precipitation event plays in these water-scarce environments. In such regions, rainfall events are infrequent and highly variable, and even small amounts can have significant ecological, agricultural, and hydrological impacts. Missing these events can lead to underestimation of available water resources, misinformed drought assessments, and inadequate planning for water supply and agricultural management. Furthermore, missed precipitation can compromise the effectiveness of hydrological models and early warning systems, which rely on accurate detection of rainfall to forecast runoff, soil moisture, and potential flood or drought conditions. Inaccurate representation of precipitation events can thus exacerbate the vulnerability of communities and ecosystems already under stress from limited water availability.

Similarly, the falsely detected precipitation graph in Figure 11b shows an overall decreasing trend, with WLDAS seemingly set apart from the others. The missed and false precipitation ratio values illustrate that ultimately WLDAS was missing the highest volume of precipitation (although the FAR was relatively low) and reported more precipitation than what was detected by the reference product. As the reference rain threshold increases to represent more extreme rain events, the volume of rain missed or falsely detected decreases.
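Unlike POD and FAR, which count events, the missed and false precipitation fractions weigh events by rainfall volume. The sketch below is one possible formulation and rests on assumptions: missed volume is normalized by the total reference volume, false volume by the total estimated volume, and a single 0.1 mm/day threshold separates wet from dry days. The function name and data are hypothetical, and the normalization used in the study may differ.

```python
def volume_fractions(est, ref, threshold=0.1):
    """Return (missed fraction, false fraction) of precipitation volume."""
    if len(est) != len(ref):
        raise ValueError("est and ref must have the same length")
    total_ref, total_est = sum(ref), sum(est)
    if total_ref == 0 or total_est == 0:
        raise ValueError("both series need a non-zero total")
    # Reference rain that fell on days the product reported as dry.
    missed = sum(r for e, r in zip(est, ref) if r > threshold and e <= threshold)
    # Product rain reported on days the reference recorded as dry.
    false = sum(e for e, r in zip(est, ref) if e > threshold and r <= threshold)
    return missed / total_ref, false / total_est


# Hypothetical paired daily values (mm/day), not data from the study.
ref_mm = [0.0, 0.0, 1.0, 5.0, 0.0, 0.0, 2.0, 0.0]
est_mm = [0.0, 0.5, 2.0, 4.0, 0.0, 0.0, 0.0, 0.0]
missed_frac, false_frac = volume_fractions(est_mm, ref_mm)
print(round(missed_frac, 3), round(false_frac, 3))
```

A product can therefore have a low false alarm ratio and still miss a large share of the rainfall volume if the events it misses are the heavier ones.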

However, when looking at ETS, WLDAS is outperformed only by ERA5-MIN, which exhibits the highest score among all products, with values very similar to those of MERRA2 (Figure 12a). As noted in the methodology, ETS decreases (by definition) as events become rarer, that is, when higher reference precipitation thresholds are used. TSS, shown in Figure 12b, balances the ability to correctly detect both events and nonevents, making it less sensitive to event frequency and offering a robust assessment of overall forecast skill. The different precipitation products exhibit a wide range of TSS values, with ERA5-MIN presenting the highest scores, followed by MERRA2. This variation indicates that some products consistently capture event occurrences better than others, while some produce more false alarms or miss events, as also shown in Figure 10 and Figure 11. The differences in TSS (and the other scores) can also arise from the different spatial and temporal resolutions and the inherently different source of each product (satellite, model, in situ). Consequently, even when overall trends in rainfall are similarly represented, the skill of individual products in detecting specific events, especially extreme or localized rainfall (towards higher values of the x-axis of Figure 12a), can differ. This wide spread in TSS highlights the importance of carefully evaluating and selecting products for operational use or scientific studies, rather than assuming that all datasets perform equivalently across different rainfall intensities and event types.
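The two composite scores can be sketched from contingency counts as shown below. The sketch assumes that ETS denotes the equitable threat score and TSS the true skill statistic in their standard forecast-verification forms; the function names and the counts are hypothetical and not taken from the study.

```python
def ets(hits, misses, false_alarms, correct_negatives):
    """Equitable threat score: threat score adjusted for hits expected by chance."""
    total = hits + misses + false_alarms + correct_negatives
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denominator = hits + misses + false_alarms - hits_random
    if denominator == 0:
        raise ValueError("ETS is undefined for these counts")
    return (hits - hits_random) / denominator


def tss(hits, misses, false_alarms, correct_negatives):
    """True skill statistic: hit rate minus false alarm rate over non-events."""
    if hits + misses == 0 or false_alarms + correct_negatives == 0:
        raise ValueError("TSS needs at least one event and one non-event")
    hit_rate = hits / (hits + misses)
    false_rate = false_alarms / (false_alarms + correct_negatives)
    return hit_rate - false_rate


# Hypothetical counts: 2 hits, 1 miss, 1 false alarm, 4 correct negatives.
ets_value = ets(2, 1, 1, 4)
tss_value = tss(2, 1, 1, 4)
print(round(ets_value, 3), round(tss_value, 3))
```

Both scores equal 1 for a perfect product and are close to 0 when detection is no better than chance.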

3.3. Dataset Ranking

To provide a high-level glance at how each estimation product performed and compared to one another, a ranking system was used to assist in answering each research question. The rankings are listed in Table 5, Table 6, Table 7 and Table 8. Each product was assigned a number from 1 to 5 for each metric, with 1 indicating the best result and 5 the worst relative to the other products.
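One way such a ranking could be produced for a single metric is sketched below, using the wet-day percentages reported earlier in this section and ranking products by their absolute distance from the reference value. The function name and the tie-handling (ties keep input order) are assumptions; the authors' exact procedure is not specified beyond the 1-to-5 scale.

```python
def rank_products(scores, ideal):
    """Rank products from 1 (closest to `ideal`) upward; ties keep input order."""
    ordered = sorted(scores, key=lambda name: abs(scores[name] - ideal))
    return {name: position + 1 for position, name in enumerate(ordered)}


# Wet-day percentages reported in the text; the reference value is 19.5%.
wet_day_percent = {
    "ERA5-MAX": 37.0,
    "ERA5-MIN": 17.3,
    "MERRA2": 26.2,
    "WLDAS": 17.7,
    "IMERG": 31.8,
}
ranks = rank_products(wet_day_percent, ideal=19.5)
print(ranks)
```

For metrics with a different ideal value (for example, 1 for the bias ratio or 0 for the mean error), only the `ideal` argument would change.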

To answer the first research question, i.e., how well different products estimate average precipitation in a dry-climate region, the following metrics were considered: CDF, overall bias ratio, overall mean error, overall RMSE, and overall correlation coefficient. Overall, WLDAS ranks best, with all other products performing similarly to each other. For the CDF, the product that presented the closest 50th quantile to the one of the reference dataset was ranked first. Although WLDAS and ERA5-MAX rank high for CDF, ERA5-MAX is characterized by a large positive bias. While ERA5-MAX and WLDAS differed from the reference average by 0.10 mm/day or less (about 10% of the reference dataset average), the remaining three estimation products differed by at least 0.30 mm/day (about 30% of the reference dataset average). This emphasizes that the difference in statistical results between products is not necessarily clearly illustrated by the rankings, and the actual results must still be taken into consideration while evaluating a product’s performance. Another difference which is not clear from the rankings is that all bias ratios (and mean errors) were very close to one another, except for ERA5-MAX, which was significantly higher. Furthermore, ERA5-MAX, IMERG, and MERRA2 showed comparable RMSEs, whereas ERA5-MIN and WLDAS are characterized by smaller RMSEs. The correlation coefficient has larger variability and is generally well represented by the ranking.

To answer the second question, i.e., the performance of precipitation products during extreme precipitation events, the same metrics shown in Table 5 were used, but at the 95th percentile instead (Table 6). As with the findings for average precipitation, WLDAS ranks the highest among the five products, exhibiting the lowest bias ratio, mean error, and RMSE as well as the second-best 95th percentile (i.e., closest to the one of the ground reference) and correlation coefficient. Nevertheless, in this case, ERA5-MAX ranks last, with MERRA2 and IMERG the next lowest. This is due to the fact that ERA5-MAX is characterized by the largest errors (bias ratio, mean error, and RMSE). The issue is not that ERA5-MAX fails to generate extreme rainfall events; rather, it produces them with excessive magnitude, and the evaluation metrics penalize this overestimation.

At the 95th percentile, cumulative precipitation rates were comparable among the reference dataset, WLDAS, and ERA5-MIN (as shown in Table 4). In contrast, MERRA2 and IMERG tended to underestimate high rainfall rates, while ERA5-MAX significantly overestimated them. The bias ratio for events at or above the 95th percentile was closest to the ideal value of 1 for both WLDAS and ERA5-MIN. IMERG and MERRA2 both presented bias ratios exceeding 2, while ERA5-MAX had a bias ratio approaching 3. A similar pattern emerged for mean error and RMSE, with WLDAS and ERA5-MIN nearly tied for the most accurate estimates. These were followed by MERRA2 and IMERG, and finally ERA5-MAX, which had the largest deviation from the ideal. Nevertheless, the correlation coefficient told a different story: ERA5-MAX had the highest correlation with the reference data in the case of rare events, followed—though more distantly—by WLDAS and IMERG. This marked a shift from the rankings observed at the 50th percentile, largely due to a notable drop in ERA5-MIN’s correlation for extreme precipitation events. Overall, these results suggest that ERA5-MIN may be more reliable for estimating average precipitation than for capturing extreme rainfall events, a conclusion supported by its performance across both average and high-intensity precipitation metrics.

The third research question focused on the capabilities of the different products to detect overall precipitation. The ranking of the datasets utilized probability of detection, false alarm rate, missed precipitation fraction, falsely detected precipitation fraction, ETS, and TSS (Table 7). Similar to research objective 1, the values of these metrics were taken for the overall precipitation (larger than 0.1 mm/day).

For the percentage of wet days (defined as days receiving more than 0.1 mm of rainfall), the ranking was determined based on the overall proportion of such days throughout the study period. WLDAS and ERA5-MIN aligned most closely with the reference dataset, followed, though less closely, by MERRA2, IMERG, and ERA5-MAX. ERA5-MAX recorded the highest average probability of detection, outperforming IMERG and MERRA2 by over 10%. In contrast, ERA5-MIN and WLDAS trailed those two products by an additional 10%. Missed precipitation rates were comparable among ERA5-MIN, IMERG, and MERRA2, with ERA5-MAX performing slightly better and WLDAS showing a higher rate than all others. IMERG, ERA5-MAX, and MERRA2 exhibited very similar values of falsely detected precipitation. ERA5-MIN had a noticeably lower rate, whereas WLDAS again stood out with a substantially higher rate than the rest. ETS and TSS offer a more holistic view of the previously discussed metrics, with ERA5-MIN presenting the highest ETS and TSS. Taken together, these findings suggest that ERA5-MIN performs particularly well, whereas WLDAS and IMERG are the least advisable options when accurate detection of overall precipitation is a key criterion.

The analysis of the capability of the various products to detect extreme precipitation rates is based on the same metrics shown in Table 7, but using values computed for the 95th percentile (Table 8). In response to objective 4, ERA5-MIN ranks first, followed by IMERG, whereas WLDAS presents the lowest ranking. The probability of detection for extreme precipitation events increases significantly across all estimation products—except for WLDAS, which remains below 90% even at thresholds as high as 20 mm/day. This suggests that, although WLDAS performs reasonably well in estimating daily rainfall amounts, it is unreliable in detecting the timing or occurrence of rainfall, particularly during extreme events. The false alarm rate for extreme events was consistently similar across all products, showing no major outliers. However, both missed precipitation and false precipitation followed a pattern similar to POD: values were comparable among all products except WLDAS, which exhibited significantly higher errors in both categories. In terms of ETS and TSS, ERA5-MIN once again ranks the highest, whereas ERA5-MAX presents the poorest performance. These findings highlight a notable weakness in WLDAS—its limited ability to accurately detect the presence of rainfall, especially during high-intensity events.

4. Conclusions

Climate change is driving increasingly dramatic shifts in weather patterns across the globe. One notable example is the rising frequency, intensity, and duration of precipitation events in the southwestern United States, a region traditionally characterized by arid and semi-arid conditions. These extreme rainfall events, which historically occurred less than once in a century, are now becoming more common. As a result, existing infrastructure—designed for much drier conditions—is often overwhelmed and prone to failure, highlighting the urgent need for climate-resilient planning and adaptation strategies.

This study answered four research questions posed in Section 1:

What is the ability of different precipitation products to estimate the magnitude of average precipitation locally in a dry climate region?
The assessment of average precipitation estimates in Palm Desert, California shows that, although the ranking system provides a general overview, it can obscure important differences among products. For instance, ERA5-MAX appears strong because its 50th percentile aligns closely with the reference dataset, yet it also exhibits a substantial positive bias. WLDAS and ERA5-MAX differ from the reference mean by no more than 10%, whereas the other products deviate by at least 30%. ERA5-MAX, IMERG, and MERRA2 show comparable RMSEs, while ERA5-MIN and WLDAS are characterized by smaller errors. Correlation coefficients display larger variability and are generally reflected well in the ranking, with ERA5-MAX showing the best alignment with the CPC dataset. However, matching a median or average does not guarantee accurate representation of the full precipitation distribution, and the inherent smoothing, algorithmic assumptions, and averaging in gridded products are the main cause of biases and errors.
What is the ability of different precipitation products to estimate the magnitude of extreme precipitation events in a dry area?
For rare precipitation events, many of the patterns seen for average precipitation persist, but some important differences emerge. ERA5-MAX shows the largest bias ratio, mean error, and RMSE, not because it fails to generate extreme events, but because it produces them with excessive magnitude. MERRA2 and IMERG generally underestimate high rainfall rates, while WLDAS and ERA5-MIN yield 95th-percentile values closer to the reference dataset. Nonetheless, ERA5-MAX correlates more strongly with the reference data for extreme events than the other products. These results highlight that products may perform differently under extreme versus average conditions and that rankings alone cannot fully represent those distinctions.
Differences in how precipitation products perform under rare, extreme events often stem from the coarse spatial resolution and inherent smoothing of grid-based reanalysis models and satellite observations. Such smoothing tends to blur and dilute intense, localized rainfall peaks, causing coarse products to underestimate extremes. However, when a product instead attempts to compensate, it can generate excessive magnitudes, inflating bias ratio, RMSE, and mean error.
What is the ability of different precipitation products to estimate the occurrence of overall precipitation in a dry climate region?
The overall ability of the five products to accurately detect the presence of rainfall was assessed using contingency metrics, including probability of detection, false alarm rate, missed precipitation fraction, and false precipitation fraction. Based on these metrics, ERA5-MAX shows the highest probability of detection, IMERG and MERRA2 follow, and ERA5-MIN and WLDAS detect substantially fewer events. Missed-event rates are similar for ERA5-MIN, IMERG, and MERRA2, slightly lower for ERA5-MAX, and highest for WLDAS. False-alarm rates cluster closely among IMERG, ERA5-MAX, and MERRA2; ERA5-MIN is lower, and WLDAS is again noticeably higher. Composite scores (ETS and TSS) favor ERA5-MIN, indicating more balanced detection performance overall, while WLDAS and IMERG perform less reliably.
These differences likely arise from how each product handles low-intensity precipitation. Products such as ERA5-MAX and IMERG tend to generate more light rainfall events, boosting detection but also increasing false alarms. WLDAS appears particularly sensitive in this regard, triggering too many near-threshold events. In contrast, ERA5-MIN applies a more conservative thresholding or filtering of light precipitation, leading to fewer false detections and more accurate classification overall. Differences in model physics, sensor noise characteristics, and drizzle-handling schemes likely drive these systematic biases in wet-day identification.
What is the ability of different precipitation products to estimate the occurrence of extreme precipitation in a dry climate region?
When looking at the detection of extreme precipitation events (95th percentile), ERA5-MIN ranks first overall, followed by IMERG; by contrast, WLDAS ranks lowest. For most thresholds, detection increases for all products except WLDAS, which remains below 90% even at high thresholds (e.g., ≥20 mm/day). False-alarm rates are fairly similar among products. However, WLDAS shows significantly higher missed and false precipitation rates than the others. Composite skill scores (ETS and TSS) again favor ERA5-MIN, while ERA5-MAX performs worst. WLDAS appears relatively good at estimating average rainfall amounts, but notably weak at reliably detecting when extreme rainfall events occur. As mentioned above, global and/or coarse-grid datasets struggle with representing intense, localized rainfall events. As a result, some products miss the timing of extreme events (as in WLDAS), even if they approximate overall daily rainfall reasonably well.

In summary, some products, like ERA5-MIN, manage to strike a balance between detecting the occurrence of extreme rainfall and estimating its magnitude reasonably well. In contrast, WLDAS—while good at reproducing daily rainfall amounts overall—struggles to reliably capture the timing and occurrence of intense rainfall events. As a result, WLDAS often misses extreme events or records them at the wrong time, which degrades its probability of detection (POD), increases missed-event counts, and lowers skill scores (like ETS and TSS).

Part of the underlying cause is that precipitation extremes are often short-lived, highly localized, convective events. Global-scale reanalysis and coarse-gridded satellite products tend to smooth spatial and temporal variability, which often misses these sharp spikes in rainfall. Meanwhile, even when a product does detect an event (as with IMERG or ERA5-MAX), errors in estimating the intensity—either overestimating or underestimating—degrade the bias ratio, mean error, and RMSE. This can happen if the algorithms or physical parameterizations poorly represent convective processes or sub-grid rainfall dynamics.

Finally, the fact that some products (e.g., ERA5-MAX) still show reasonable correlation with reference data despite large magnitude errors reveals another point: a product may get the timing of events roughly right (wet vs. dry days), even if it fails to reproduce the true rainfall amounts. Correlation reflects temporal alignment more than magnitude accuracy, so a biased product can still score well on correlation even when its errors in intensity are large.

In addition to its predominantly dry climate, this region is characterized by highly complex terrain—a factor well known to challenge both the observation and modeling of precipitation. These topographic influences can introduce substantial uncertainty by shaping local atmospheric dynamics and limiting the ability of observational networks and models to accurately resolve spatial variability. When combined with the relatively coarse resolution of several of the datasets used here, these factors likely contribute to the discrepancies identified in our analysis. The influence of terrain–resolution interactions is evident in Figure 2. Higher-resolution datasets capture precipitation enhancement on the windward slopes, reflecting their improved ability to represent orographic processes. In contrast, lower-resolution datasets—unable to adequately resolve the underlying terrain—fail to capture this signal. This mismatch underscores the importance of considering both topographic complexity and dataset resolution when interpreting precipitation estimates in mountainous, arid environments.

Future work should investigate the conclusions drawn above in different regions characterized by a similar climate to generalize the results presented in this study. Additional satellite-based products and re-analysis data should also be assessed together with ground radar observations, if available. Time series should also be extended to a longer temporal range.

The impact of bias correction techniques applied to each estimation product could also be considered, as they can significantly influence overall performance. Bias corrections are often implemented to align modeled or satellite-derived precipitation estimates with observed data, improving accuracy in magnitude and distribution. However, these adjustments can also introduce new uncertainties or mask underlying deficiencies in the original datasets. Evaluating how each product’s performance changes before and after bias correction can provide valuable insight into the true capabilities of the raw estimation models versus the effectiveness of the correction methods themselves. This distinction is especially important when comparing products across different climate regimes or event intensities, such as average versus extreme precipitation. In future analyses, incorporating a systematic comparison of bias-corrected versus uncorrected outputs could help clarify whether observed improvements are due to the model’s inherent skill or the strength of the correction algorithm applied.

Author Contributions

Conceptualization, V.M.; methodology, V.M.; formal analysis, N.B.-W.; investigation, N.B.-W.; data curation, N.B.-W.; writing—original draft preparation, N.B.-W.; writing—review and editing, V.M.; visualization, N.B.-W.; supervision, V.M.; project administration, V.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created. IMERG data are available at https://gpm.nasa.gov/data/directory, accessed on 1 December 2025. ERA5 data are available at https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels?tab=overview, accessed on 1 December 2025. CPC data are available at https://psl.noaa.gov/data/gridded/data.cpc.globalprecip.html, accessed on 1 December 2025. MERRA2 data are available at https://gmao.gsfc.nasa.gov/gmao-products/merra-2/data-access_merra-2/, accessed on 1 December 2025. WLDAS data are available at https://disc.gsfc.nasa.gov/datasets/WLDAS_NOAHMP001_DA1_D1.0/summary, accessed on 1 December 2025.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Milly, P.C.D.; Betancourt, J.; Falkenmark, M.; Hirsch, R.M.; Kundzewicz, Z.W.; Lettenmaier, D.P.; Stouffer, R.J. Stationarity Is Dead: Whither Water Management? Science 2008, 319, 573–574. [Google Scholar] [CrossRef]
Ragno, E.; AghaKouchak, A.; Love, C.A.; Cheng, L.; Vahedifard, F.; Lima, C.H.R. Quantifying Changes in Future Intensity-Duration-Frequency Curves Using Multimodel Ensemble Simulations. Water Resour. Res. 2018, 54, 1751–1764. [Google Scholar] [CrossRef]
Mishra, A.; Mukherjee, S.; Merz, B.; Singh, V.P.; Wright, D.B.; Villarini, G.; Paul, S.; Kumar, D.N.; Khedun, C.P.; Niyogi, D.; et al. An Overview of Flood Concepts, Challenges, and Future Directions. J. Hydrol. Eng. 2022, 27, 03122001. [Google Scholar] [CrossRef]
Wang, L.; Cui, S.; Li, Y.; Huang, H.; Manandhar, B.; Nitivattananon, V.; Fang, X.; Huang, W. A review of the flood management: From flood control to flood resilience. Heliyon 2022, 8, e11763. [Google Scholar] [CrossRef] [PubMed]
Yao, J.-Q.; Chen, J.; Zhang, T.-W.; Dilinuer, T.; Mao, W.-Y. Stationarity in the variability of arid precipitation: A case study of arid Central Asia. Adv. Clim. Change Res. 2021, 12, 172–186. [Google Scholar] [CrossRef]
Sun, F.; Roderick, M.L.; Farquhar, G.D. Rainfall statistics, stationarity, and climate change. Proc. Natl. Acad. Sci. USA 2018, 115, 2305–2310. [Google Scholar] [CrossRef]
Reinhart, B.J. Hurricane Hilary. Available online: https://www.nhc.noaa.gov/data/tcr/EP092023_Hilary.pdf (accessed on 12 February 2024).
Mousam, A.; Maggioni, V.; Delamater, P.L.; Quispe, A.M. Using remote sensing and modeling techniques to investigate the annual parasite incidence of malaria in Loreto, Peru. Adv. Water Resour. 2017, 108, 423–438. [Google Scholar] [CrossRef]
Serrat-Capdevila, A.; Valdes, J.B.; Stakhiv, E.Z. Water Management Applications for Satellite Precipitation Products: Synthesis and Recommendations. JAWRA J. Am. Water Resour. Assoc. 2014, 50, 509–525. [Google Scholar] [CrossRef]
Maggioni, V.; Meyers, P.C.; Robinson, M.D. A Review of Merged High-Resolution Satellite Precipitation Product Accuracy During the Tropical Rainfall Measuring Mission (TRMM) Era. 2016. Available online: https://repository.library.noaa.gov/view/noaa/69325 (accessed on 22 April 2025).
Michaelides, S.; Levizzani, V.; Anagnostou, E.; Bauer, P.; Kasparis, T.; Lane, J.E. Precipitation: Measurement, remote sensing, climatology and modeling. Atmos. Res. 2009, 94, 512–533. [Google Scholar] [CrossRef]
Guo, J.; Liu, H.; Li, Z.; Rosenfeld, D.; Jiang, M.; Xu, W.; Jiang, J.H.; He, J.; Chen, D.; Min, M.; et al. Aerosol-induced changes in the vertical structure of precipitation: A perspective of TRMM precipitation radar. Atmos. Chem. Phys. 2018, 18, 13329–13343. [Google Scholar] [CrossRef]
Iguchi, T.; Kozu, T.; Meneghini, R.; Awaka, J.; Okamoto, K. Rain-Profiling Algorithm for the TRMM Precipitation Radar. J. Appl. Meteorol. Climatol. 2000, 39, 2038–2052. [Google Scholar] [CrossRef]
Yang, S.; Smith, E.A. Convective-Stratiform Precipitation Variability at Seasonal Scale from Eight Years of TRMM Observations: Implications for Multiple Modes of Diurnal Variability. 2007. Available online: https://ntrs.nasa.gov/citations/20080023283 (accessed on 24 April 2025).
Kidd, C.; Bauer, P.; Turk, J.; Huffman, G.J.; Joyce, R.; Hsu, K.-L.; Braithwaite, D. Intercomparison of High-Resolution Precipitation Products over Northwest Europe. J. Hydrometeorol. 2012, 13, 67–83. [Google Scholar] [CrossRef]
Huffman, G.J.; Bolvin, D.T.; Nelkin, E.J.; Wolff, D.B.; Adler, R.F.; Gu, G.; Hong, Y.; Bowman, K.P.; Stocker, E.F. The TRMM Multisatellite Precipitation Analysis (TMPA): Quasi-Global, Multiyear, Combined-Sensor Precipitation Estimates at Fine Scales. J. Hydrometeorol. 2007, 8, 38–55. [Google Scholar] [CrossRef]
Sapiano, M.R.P.; Arkin, P.A. An Intercomparison and Validation of High-Resolution Satellite Precipitation Estimates with 3-Hourly Gauge Data. J. Hydrometeorol. 2009, 10, 149–166. [Google Scholar] [CrossRef]
Golian, S.; Javadian, M.; Behrangi, A. On the use of satellite, gauge, and reanalysis precipitation products for drought studies. Environ. Res. Lett. 2019, 14, 075005. [Google Scholar] [CrossRef]
En-nagre, K.; Aqnouy, M.; Bouadila, A.; Et-Takaouy, C.; Chahid, M.; Bouizrou, I.; Hilal, I.; El Messari, J.E.; Tariq, A. Assessment of three satellite precipitation products for hydrological studies in a data-scarce context: Ouarzazate Basin, Southern Morocco. Nat. Hazards Res. 2025, 5, 728–741. [Google Scholar] [CrossRef]
dos Reis, A.A.; Weerts, A.; Ramos, M.H.; Wetterhall, F.; dos Santos Fernandes, W. Hydrological data and modeling to combine and validate precipitation datasets relevant to hydrological applications. J. Hydrol. Reg. Stud. 2022, 44, 101200. [Google Scholar] [CrossRef]
Hinge, G.; Hamouda, M.A.; Long, D.; Mohamed, M.M. Hydrologic utility of satellite precipitation products in flood prediction: A meta-data analysis and lessons learnt. J. Hydrol. 2022, 612, 128103. [Google Scholar] [CrossRef]
Belabid, N.; Zhao, F.; Brocca, L.; Huang, Y.; Tan, Y. Near-Real-Time Flood Forecasting Based on Satellite Precipitation Products. Remote Sens. 2019, 11, 252. [Google Scholar] [CrossRef]
Morin, E.; Marra, F.; Armon, M. Dryland Precipitation Climatology from Satellite Observations. In Satellite Precipitation Measurement: Volume 2; Levizzani, V., Kidd, C., Kirschbaum, D.B., Kummerow, C.D., Nakamura, K., Turk, F.J., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 843–859. [Google Scholar] [CrossRef]
Serrat-Capdevila, A.; Merino, M.; Valdes, J.B.; Durcik, M. Evaluation of the Performance of Three Satellite Precipitation Products over Africa. Remote Sens. 2016, 8, 836. [Google Scholar] [CrossRef]
Nozarpour, N.; Mahjoobi, E.; Golian, S. Assessment of Satellite-based Precipitation Products in Monthly, Seasonal, and Annual Time-Scale over Iran. Int. J. Environ. Res. 2024, 18, 76. [Google Scholar] [CrossRef]
Moreno, M.; Bertolín, C.; Ortiz, P.; Ortiz, R. Satellite product to map drought and extreme precipitation trend in Andalusia, Spain: A novel method to assess heritage landscapes at risk. Int. J. Appl. Earth Obs. Geoinf. 2022, 110, 102810. [Google Scholar] [CrossRef]
Vernimmen, R.R.E.; Hooijer, A.; Mamenun Aldrian, E.; van Dijk, A.I.J.M. Evaluation and bias correction of satellite rainfall data for drought monitoring in Indonesia. Hydrol. Earth Syst. Sci. 2012, 16, 133–146. [Google Scholar] [CrossRef]
Zhang, Y.; Wu, C.; Yeh, P.J.; Li, J.; Hu, B.X.; Feng, P.; Jun, C. Evaluation and comparison of precipitation estimates and hydrologic utility of CHIRPS, TRMM 3B42 V7 and PERSIANN-CDR products in various climate regimes. Atmos. Res. 2022, 265, 105881. [Google Scholar] [CrossRef]
Manzanas, R.; Amekudzi, L.K.; Preko, K.; Herrera, S.; Gutiérrez, J.M. Precipitation variability and trends in Ghana: An intercomparison of observational and reanalysis products. Clim. Change 2014, 124, 805–819. [Google Scholar] [CrossRef]
Zolina, O.; Kapala, A.; Simmer, C.; Gulev, S.K. Analysis of extreme precipitation over Europe from different reanalyses: A comparative assessment. Glob. Planet. Change 2004, 44, 129–161. [Google Scholar] [CrossRef]
Jiang, Q.; Li, W.; Fan, Z.; He, X.; Sun, W.; Chen, S.; Wen, J.; Gao, J.; Wang, J. Evaluation of the ERA5 reanalysis precipitation dataset over Chinese Mainland. J. Hydrol. 2021, 595, 125660. [Google Scholar] [CrossRef]
Dollan, I.J.; Maina, F.Z.; Kumar, S.V.; Nikolopoulos, E.I.; Maggioni, V. An assessment of gridded precipitation products over High Mountain Asia. J. Hydrol. Reg. Stud. 2024, 52, 101675. [Google Scholar] [CrossRef]
Daramola, M.T.; Xu, M. Recent changes in global dryland temperature and precipitation. Int. J. Climatol. 2022, 42, 1267–1282. [Google Scholar] [CrossRef]
Lavers, D.A.; Simmons, A.; Vamborg, F.; Rodwell, M.J. An evaluation of ERA5 precipitation for climate monitoring. Q. J. R. Meteorol. Soc. 2022, 148, 3152–3165. [Google Scholar] [CrossRef]
US Department of Commerce, National Oceanic and Atmospheric Administration. Climate. Available online: https://www.weather.gov/wrh/climate?wfo=sgx (accessed on 26 March 2025).
Atlas of the Biodiversity of California. Available online: https://wildlife.ca.gov/Data/Atlas (accessed on 31 March 2025).
Storm Events Database | National Centers for Environmental Information. Available online: https://www.ncdc.noaa.gov/stormevents/ (accessed on 14 May 2025).
Chen, M.; Shi, W.; Xie, P.; Silva, V.B.S.; Kousky, V.E.; Wayne Higgins, R.; Janowiak, J.E. Assessing objective techniques for gauge-based analyses of global daily precipitation. J. Geophys. Res. Atmos. 2008, 113, 2007JD009132. [Google Scholar] [CrossRef]
Xie, P.; Chen, M.; Yang, S.; Yatagai, A.; Hayasaka, T.; Fukushima, Y.; Liu, C. A gauge-based analysis of daily precipitation over East Asia. J. Hydrometeorol. 2007, 8, 607–626. [Google Scholar] [CrossRef]
Soci, C.; Hersbach, H.; Simmons, A.; Poli, P.; Bell, B.; Berrisford, P.; Horányi, A.; Muñoz-Sabater, J.; Nicolas, J.; Radu, R.; et al. The ERA5 global reanalysis from 1940 to 2022. Q. J. R. Meteorol. Soc. 2024, 150, 4014–4048. [Google Scholar] [CrossRef]
Global Modeling and Assimilation Office Pawson. MERRA-2 tavg1_2d_flx_Nx: 2d,1-Hourly, Time-Averaged, Single-Level, Assimilation, Surface Flux Diagnostics V5.12.4. 2015. Available online: https://disc.gsfc.nasa.gov/datacollection/M2T1NXFLX_5.12.4.html (accessed on 1 October 2024).
Erlingis, J.; Li, B.; Rodell, M. WLDAS Noah-MP 3.6 Land Surface Model L4 Daily 0.01 Degree × 0.01 Degree, Version D1.0; Goddard Space Flight Center: Greenbelt, MD, USA, 2023. [CrossRef]
Huffman, G.J.; Stocker, E.F.; Bolvin, D.T.; Nelkin, E.J.; Tan, J. GPM IMERG Final Precipitation L3 1 Day 0.1 Degree × 0.1 Degree, V07; Goddard Space Flight Center: Greenbelt, MD, USA, 2023. [CrossRef]
Massari, C.; Maggioni, V. Error and Uncertainty Characterization. In Satellite Precipitation Measurement: Volume 2; Levizzani, V., Kidd, C., Kirschbaum, D.B., Kummerow, C.D., Nakamura, K., Turk, F.J., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 515–532. [Google Scholar] [CrossRef]
Anagnostou, E.N.; Maggioni, V.; Nikolopoulos, E.I.; Meskele, T.; Hossain, F.; Papadopoulos, A. Benchmarking High-Resolution Global Satellite Rainfall Products to Radar and Rain-Gauge Rainfall Estimates. IEEE Trans. Geosci. Remote Sens. 2010, 48, 1667–1683. [Google Scholar] [CrossRef]
Tartaglione, N. Relationship between precipitation forecast errors and skill scores of dichotomous forecasts. Weather Forecast. 2010, 25, 355–365. [Google Scholar] [CrossRef]
Stephenson, D.B.; Casati, B.; Ferro, C.A.; Wilson, C.A. The extreme dependency score: A non-vanishing measure for forecasts of rare events. Meteorol. Appl. A J. Forecast. Pract. Appl. Train. Tech. Model. 2008, 15, 41–50. [Google Scholar] [CrossRef]
Accadia, C.; Mariani, S.; Casaioli, M.; Lavagnini, A.; Speranza, A. Sensitivity of precipitation forecast skill scores to bilinear interpolation and a simple nearest-neighbor average method on high-resolution verification grids. Weather Forecast. 2003, 18, 918–932. [Google Scholar] [CrossRef]
Woodcock, F. The evaluation of yes/no forecasts for scientific and administrative purposes. Mon. Weather Rev. 1976, 104, 1209–1214. [Google Scholar] [CrossRef]
Oliveira, R.; Maggioni, V.; Vila, D.; Porcacchia, L. Using satellite error modeling to improve GPM-Level 3 rainfall estimates over the Central Amazon Region. Remote Sens. 2016, 10, 336. [Google Scholar] [CrossRef]
Khan, S.; Maggioni, V.; Kirstetter, P.E. Investigating the potential of using satellite-based precipitation radars as reference for evaluating multi-satellite merged products. J. Geophys. Res. Atmos. 2018, 123, 8646–8660. [Google Scholar] [CrossRef]
Maggioni, V.; Nikolopoulos, E.; Anagnostou, E.; Borga, M. Modeling satellite precipitation errors over mountainous terrain: The influence of gauge density, seasonality, and temporal resolution. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4130–4140. [Google Scholar] [CrossRef]
Maggioni, V.; Sapiano, M.; Adler, R.; Tian, Y.; Huffman, G. An error model for uncertainty quantification in high-time resolution precipitation products. J. Hydrometeorol. 2014, 15, 1274–1292. [Google Scholar] [CrossRef]

Figure 1. Study area map, measuring about 1.2° by 1°, surrounding Palm Desert, California, United States.

Figure 2. Maps of average precipitation over the study area during the study period, as estimated by (a) CPC; (b) MERRA2; (c) ERA5-MIN; (d) ERA5-MAX; (e) WLDAS; and (f) IMERG.

Figure 3. Time series of daily precipitation estimated by (a) the in situ measurements; (b) IMERG; (c) ERA5-MIN; (d) ERA5-MAX; (e) MERRA2; and (f) WLDAS during the study period.

Figure 4. Time series of the different precipitation products during events recorded around (a) 29 December 2004 and (b) 14 February 2019.

Figure 5. Scatter plots of the estimated datasets and the reference in situ dataset over the study period, on a logarithmic scale.

Figure 6. Cumulative distribution functions (CDFs) of the estimated datasets and the reference dataset.

Figure 7. Percentages of wet days per year, defined as days during which more than 0.1 mm of precipitation was observed.

Figure 8. (a) Bias ratio (unitless), (b) mean error (mm/day), and (c) RMSE (mm/day) of all precipitation products with respect to the ground measurements for rates higher than the 75th, 90th, and 95th percentiles.

Figure 9. Correlation coefficient (unitless) for all products with respect to the reference dataset for precipitation rates higher than the 75th, 90th, and 95th percentiles. For the “overall” case, ERA5-MIN overlaps with WLDAS. For the 75th percentile, ERA5-MIN overlaps with IMERG.

Figure 10. (a) Probability of Detection (POD) as a function of precipitation threshold of each estimation product, and (b) False Alarm Ratio (FAR) as a function of reference precipitation threshold.

Figure 11. (a) Missed precipitation as a function of reference precipitation for each product, and (b) falsely detected precipitation as a function of reference precipitation for each product.

Figure 12. (a) Equitable Threat Score and (b) True Skill Score as a function of reference precipitation rate for each product.

Table 1. Key features of each dataset, including spatial and temporal resolutions, and precipitation units.

	ERA5	MERRA2	WLDAS	IMERG
Spatial Resolution	0.28° × 0.56°	0.5° × 0.625°	0.01° × 0.01°	0.1° × 0.1°
Number of Pixels within the Study Area	20	4	12,522	125
Temporal Resolution	Hourly	Hourly	Daily	Daily
Units	kg/(m²·s)	kg/(m²·s)	kg/(m²·s)	mm/day
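
Table 1 lists three products in mass-flux units and IMERG in mm/day, so any comparison needs a common unit. The sketch below shows one way such a conversion could be done, relying on the fact that 1 kg of liquid water spread over 1 m² forms a layer 1 mm deep. The function names are illustrative and do not come from the study, and treating each hourly value as the mean flux over its hour is an assumption.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def flux_to_mm_per_day(flux_kg_m2_s):
    """Convert a daily-mean flux in kg/(m^2 s) to a depth rate in mm/day.

    1 kg/m^2 of liquid water corresponds to a 1 mm layer, so only the
    time unit needs converting.
    """
    return flux_kg_m2_s * SECONDS_PER_DAY


def hourly_flux_to_daily_total(hourly_fluxes):
    """Accumulate 24 hourly-mean fluxes (kg/(m^2 s)) into a daily total in mm.

    Assumption (not stated in the source): each value is the mean flux
    over its hour, so it contributes flux * 3600 s of depth.
    """
    if len(hourly_fluxes) != 24:
        raise ValueError("expected 24 hourly values for one day")
    return sum(flux * SECONDS_PER_HOUR for flux in hourly_fluxes)


if __name__ == "__main__":
    # A constant flux of 1e-5 kg/(m^2 s) is roughly 0.864 mm/day.
    print(flux_to_mm_per_day(1e-5))
    print(hourly_flux_to_daily_total([1e-5] * 24))
```

For a constant flux both functions agree, which is a quick sanity check before aggregating hourly ERA5 or MERRA2 values to the daily scale of the other products.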

Table 2. Contingency table with threshold on the reference rainfall rates.

	R_eval > 0.1 mm/d	R_eval ≤ 0.1 mm/d
R_Ref > th	H	M
R_Ref ≤ th	F	Z

Table 3. Contingency table with threshold on the estimated rainfall rates.

	R_eval > th	R_eval ≤ th
R_Ref > 0.1 mm/d	H	M
R_Ref ≤ 0.1 mm/d	F	Z
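
The two contingency tables differ only in which series carries the variable threshold th, so a single counting routine can serve both. The sketch below is a minimal illustration, assuming the standard textbook definitions of POD, FAR, ETS, and TSS; the function and argument names are hypothetical, and the source does not confirm that the study used exactly these formulas.

```python
import math
from typing import NamedTuple


class Counts(NamedTuple):
    hits: int               # H: reference and estimate both exceed their thresholds
    misses: int             # M: reference exceeds, estimate does not
    false_alarms: int       # F: estimate exceeds, reference does not
    correct_negatives: int  # Z: neither exceeds


def contingency_counts(reference, estimate, ref_threshold=0.1, est_threshold=0.1):
    """Count H, M, F, Z for paired daily rates (mm/d).

    Table 2 layout: vary ref_threshold, keep est_threshold at 0.1 mm/d.
    Table 3 layout: vary est_threshold, keep ref_threshold at 0.1 mm/d.
    """
    if len(reference) != len(estimate):
        raise ValueError("series must have the same length")
    h = m = f = z = 0
    for ref, est in zip(reference, estimate):
        ref_wet = ref > ref_threshold
        est_wet = est > est_threshold
        if ref_wet and est_wet:
            h += 1
        elif ref_wet:
            m += 1
        elif est_wet:
            f += 1
        else:
            z += 1
    return Counts(h, m, f, z)


def _ratio(numerator, denominator):
    """Return NaN instead of raising when a score is undefined."""
    return numerator / denominator if denominator else math.nan


def skill_scores(c):
    """Standard dichotomous scores (assumed definitions, see lead-in)."""
    n = sum(c)
    pod = _ratio(c.hits, c.hits + c.misses)
    far = _ratio(c.false_alarms, c.hits + c.false_alarms)
    pofd = _ratio(c.false_alarms, c.false_alarms + c.correct_negatives)
    # Hits expected by chance, used by the Equitable Threat Score.
    hits_random = _ratio((c.hits + c.misses) * (c.hits + c.false_alarms), n)
    ets = _ratio(c.hits - hits_random,
                 c.hits + c.misses + c.false_alarms - hits_random)
    return {"POD": pod, "FAR": far, "ETS": ets, "TSS": pod - pofd}


if __name__ == "__main__":
    ref = [0.0, 0.5, 2.0, 0.0, 3.0, 0.05]
    est = [0.0, 0.3, 0.0, 1.0, 4.0, 0.0]
    print(contingency_counts(ref, est))                     # both thresholds at 0.1 mm/d
    print(contingency_counts(ref, est, ref_threshold=1.0))  # Table 2 layout
    print(contingency_counts(ref, est, est_threshold=1.0))  # Table 3 layout
    print(skill_scores(contingency_counts(ref, est)))
```

Keeping the two thresholds as separate arguments makes the asymmetry between Tables 2 and 3 explicit: raising the reference threshold moves light reference rain into the false-alarm and correct-negative cells, whereas raising the estimate threshold moves light estimated rain into the miss and correct-negative cells.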

Table 4. Percentiles for each precipitation product (for precipitation rates greater than 0.1 mm/day).

Percentile	ERA5-MAX	ERA5-MIN	MERRA2	WLDAS	IMERG	Ground
10	0.16	0.14	0.15	0.15	0.13	0.15
50	1.11	0.62	0.68	0.87	0.50	0.98
75	3.60	2.04	2.14	2.80	1.39	2.55
90	9.06	6.81	5.96	7.61	4.27	7.17
95	16.50	12.67	11.02	12.60	9.32	12.72
99	41.64	28.18	24.36	27.27	27.41	27.45
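
The percentiles in Table 4 are conditional on wet days, that is, on rates above 0.1 mm/day. The sketch below illustrates that conditioning step and how a percentile could then serve as the cutoff for "extreme" days. The linear interpolation between order statistics is an assumption, since the source does not state which percentile definition was used, and the function names are hypothetical.

```python
import math


def wet_day_percentile(rates, percentile, wet_threshold=0.1):
    """Percentile of daily rates (mm/day), computed over wet days only.

    Uses linear interpolation between sorted values (an assumed
    convention, not one confirmed by the source).
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must lie in [0, 100]")
    wet = sorted(r for r in rates if r > wet_threshold)
    if not wet:
        raise ValueError("no wet days above the threshold")
    rank = percentile / 100 * (len(wet) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return wet[lower] + fraction * (wet[upper] - wet[lower])


def extreme_days(rates, percentile, wet_threshold=0.1):
    """Return the rates that exceed the wet-day percentile cutoff."""
    cutoff = wet_day_percentile(rates, percentile, wet_threshold)
    return [r for r in rates if r > cutoff]


if __name__ == "__main__":
    daily = [0.0, 0.05, 0.2, 1.0, 2.0, 3.0, 4.0]
    # The two dry days (0.0 and 0.05) are excluded before ranking.
    print(wet_day_percentile(daily, 50))
    print(extreme_days(daily, 75))
```

Excluding dry days matters in an arid region: with most days at or near zero, unconditional percentiles would collapse toward zero and would not separate moderate from extreme events.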

Table 5. Performance ranking when measuring average precipitation (research question 1).

	ERA5-MAX	ERA5-MIN	MERRA2	WLDAS	IMERG
CDF	1	4	3	2	5
Bias Ratio	5	4	3	1	2
Mean Error	5	4	3	2	1
Corr. Coeff.	1	2	4	2	3
RMSE	5	2	3	1	4
SUM	17	16	16	8	15

Table 6. Performance ranking when measuring extreme precipitation (research question 2).

	ERA5-MAX	ERA5-MIN	MERRA2	WLDAS	IMERG
CDF	5	1	3	2	4
Bias Ratio	5	2	4	1	3
Mean Error	5	2	3	1	4
Corr. Coeff.	1	4	5	2	3
RMSE	5	2	4	1	3
SUM	21	11	19	7	17

Table 7. Performance ranking when detecting overall precipitation (research question 3).

	ERA5-MAX	ERA5-MIN	MERRA2	WLDAS	IMERG
Wet days %	5	2	3	1	4
POD	1	4	3	5	2
FAR	5	1	3	2	4
Missed Precip.	1	4	2	5	3
False Precip.	3	1	2	5	4
ETS	4	1	3	2	5
TSS	2	1	3	5	4
SUM	21	14	19	25	26

Table 8. Performance ranking when detecting extreme precipitation (research question 4).

	ERA5-MAX	ERA5-MIN	MERRA2	WLDAS	IMERG
POD	2	3	1	5	4
FAR	5	1	3	4	2
Missed Precip.	2	3	4	5	1
False Precip.	3	1	4	5	2
ETS	5	1	3	2	4
TSS	5	1	2	4	3
SUM	22	10	17	25	16
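
The SUM rows in Tables 5 to 8 add the per-metric ranks with equal weight, so a lower total indicates better overall performance. As an illustration, the sketch below recomputes the totals for Table 8 from the ranks printed above; the variable and function names are hypothetical, and equal weighting is inferred from the table layout rather than stated as a rule.

```python
PRODUCTS = ["ERA5-MAX", "ERA5-MIN", "MERRA2", "WLDAS", "IMERG"]

# Ranks copied from Table 8 (1 = best, 5 = worst), in PRODUCTS order.
TABLE_8_RANKS = {
    "POD": [2, 3, 1, 5, 4],
    "FAR": [5, 1, 3, 4, 2],
    "Missed Precip.": [2, 3, 4, 5, 1],
    "False Precip.": [3, 1, 4, 5, 2],
    "ETS": [5, 1, 3, 2, 4],
    "TSS": [5, 1, 2, 4, 3],
}


def rank_sums(ranks, products):
    """Sum the ranks of each product across all metrics (equal weights)."""
    totals = dict.fromkeys(products, 0)
    for metric, row in ranks.items():
        if len(row) != len(products):
            raise ValueError(f"row {metric!r} does not match the product list")
        for product, rank in zip(products, row):
            totals[product] += rank
    return totals


def order_by_total(totals):
    """Products sorted from best (lowest total) to worst (highest total)."""
    return sorted(totals, key=totals.get)


if __name__ == "__main__":
    totals = rank_sums(TABLE_8_RANKS, PRODUCTS)
    print(totals)
    print(order_by_total(totals))
```

The recomputed totals match the SUM row of Table 8, with ERA5-MIN lowest at 10 and WLDAS highest at 25. Because every metric counts equally, a product that is poor on several correlated detection metrics (for example POD and missed precipitation) is penalized more than once.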
