Assessment of the Extreme Precipitation by Satellite Estimates over South America

In developing countries, accurate rainfall estimation with adequate spatial distribution is limited due to sparse rain gauge networks. One way to solve this problem is the use of satellite-based precipitation products. These satellite products have significant spatial coverage of rainfall estimates, and it is of fundamental importance to investigate their performance across space–time scales and the factors that affect their uncertainties. In the open literature, some studies have already analyzed the ability of satellite-based rain estimation products to estimate average rainfall values. These investigations have found very close agreement between the estimates and observed data. However, further evaluation of the satellite precipitation products is necessary to improve their reliability in estimating extreme values. In this scenario, the main goal of this work is to evaluate the ability of satellite-based precipitation products to capture the characteristics of extreme precipitation over the tropical region of South America. The products evaluated in this investigation were 3B42 RT v7.0, 3B42 RT v7.0 uncalibrated, CMORPH V1.0 RAW, CMORPH V1.0 CRT, GSMAP-NRT-no gauge v6.0, GSMAP-NRT-gauge v6.0, CHIRP V2.0, CHIRPS V2.0, PERSIANN CDR v1 r1, CoSch and TAPEER v1.5 from the Frequent Rainfall Observations on GridS (FROGS) database. Some products considered in this investigation are adjusted with rain gauge values and others use only satellite information. In this study, these two sets of products were considered. In addition, gauge-based daily precipitation data, provided by Brazil’s National Institute for Space Research, were used as reference in the analyses. In order to compare gauge-based daily precipitation and satellite-based data for extreme values, statistical techniques were used to evaluate the performance of the selected satellite products over the tropical region of South America.
According to the results, the threshold for rain to be considered an extreme event in South America presented high variability, ranging from 20 to 150 mm/day, depending on the region and the percentile threshold chosen for analysis. In addition, the results showed that the ability of the satellite estimates to retrieve rainfall extremes depends on the geographical location and large-scale rainfall regimes.


Introduction
Precipitation is one of the most important meteorological variables to investigate in the hydrological cycle context. When precipitation occurs in a very intense or extreme way, it can cause the National Institute for Space Research (INPE) for the period 2012-2016. To compare SRPs and rain-gauge data, maximum values and 99th percentiles of time series were used to identify extreme events. Statistical metrics such as bias, correlation and mean square error were used.

Study Area
South America has an area of about 17.8 million km², contains 6% of the world's population, and is divided into 12 countries. The precipitation in South America is influenced by a large number of meteorological systems, which in turn can be influenced by the topographic characteristics in the region, as shown in Figure 1. A review of the atmospheric systems and a schematic representation of all the systems that occur in the low and high troposphere over South America is available in [23]. The domain analyzed in this investigation extends from 30°S to 15°N and from 90°W to 30°W, corresponding to the tropical region of South America. We considered a period of five years, ranging from 2012 to 2016.

Rainfall Satellite-Based Products
The Frequent Rainfall Observations on GridS (FROGS) dataset (DOI:10.14768/06337394-73A9-407C-9997-0E380DAC5598), which is composed of daily gridded precipitation products (satellite, ground-based and reanalysis products) adjusted to a common 1° × 1° grid resolution, was used in this study. The dataset was downloaded from the ESPRI/IPSL repository.
Developed by [22], the FROGS database was built to put all products from different groups, such as NASA, NOAA and JAXA, among others, on a common resolution. Each product was originally formulated with different purposes, considering different areas and different time scales and resolutions. Some products contain global information, others are quasi-global and still others cover specific regions. Also, there are products that consider only the ocean or land areas, and others that consider both. For this reason, the FROGS dataset was built to offer accessible data on daily precipitation products with a common grid of 1° × 1°, and to help clarify some of the uncertainties that are inherent in all precipitation products by making them comparable. The FROGS database is formed by 32 products, whose files are produced in NETCDF-4 format and contain information such as longitude, latitude, time and rain intensity. To perform this investigation, we considered the subset of SRPs for the period ranging from 2012 to 2016.
The SRPs selected in this study are described below. The 3B42RT is a near-real-time product, also known as the TRMM Multi-satellite Precipitation Analysis (TMPA) Real-Time algorithm [24], in which passive microwave (PMW) and infrared (IR) data are combined and historical rain gauge information is incorporated in the calibrated product. The Global Satellite Mapping of Precipitation (GSMAP) product [25] is based on microwave estimation of rainfall but also uses IR geostationary imagery to extrapolate the PMW estimates; rain gauge observations are used to correct for bias. The Climate Prediction Center morphing technique (CMORPH) [26] is a product that combines PMW and IR data to 'morph' the PMW-estimated fields; a bias correction technique, using the Climate Prediction Center (CPC) rain gauge analysis [27], is applied over land surfaces for the V1.0 CRT version. The Combined Scheme approach (CoSch) [28] is a product that uses the Real-Time TRMM Multi-satellite Precipitation Analysis (TMPA) algorithm and rain gauge data from the Global Telecommunications System (GTS) and multiple institutions in Latin America to remove the bias from TMPA. The Tropical Amount of Precipitation with an Estimation of ERror (TAPEER) is a product based on the GOES precipitation index technique [29], which merges geostationary infrared images with microwave instantaneous rain rate estimates. The Climate Hazards Infrared Precipitation (CHIRP) [30] is a product based on infrared observations from geostationary satellites. The CHIRPS product is a merger of rain station information with CHIRP estimates using a weighted average of the closest stations. The Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR) [31] is an infrared-based product that uses neural networks to obtain rain rate information.

Rainfall Ground-Based Data
In this study, two different ground-based databases were analyzed. The first database came from the Global Precipitation Climatology Center (GPCC) and is included in the FROGS product. It is available on a grid with a resolution of 1° × 1°, considers rainfall information from around the globe, and uses a modified SPHEREMAP interpolation scheme [32]. The second database was provided by Brazil's National Institute for Space Research (INPE) and contains data from automatic and conventional rain gauges/stations from different networks in South America. This database is in DAT format and was interpolated to a regular 1° × 1° grid using a simple average and then converted to NETCDF format, ensuring the consistency of spatial and temporal resolutions for comparisons.
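The simple-average gridding of station data described above can be illustrated with a minimal NumPy sketch. It is not the INPE processing chain: the function name `grid_station_means`, its arguments and the cell-assignment rule (each station contributes to the 1° cell that contains it) are assumptions made for illustration, and the default bounds are the study domain given in the Study Area section.

```python
import numpy as np

def grid_station_means(lats, lons, values, lat_min=-30.0, lat_max=15.0,
                       lon_min=-90.0, lon_max=-30.0, res=1.0):
    """Average station values falling in each res x res degree cell.

    Hypothetical helper: returns a (n_lat, n_lon) array in which cells
    without any reporting station are NaN.
    """
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    values = np.asarray(values, dtype=float)
    n_lat = int(round((lat_max - lat_min) / res))
    n_lon = int(round((lon_max - lon_min) / res))
    # Index of the cell containing each station (south-west edge inclusive).
    i = np.floor((lats - lat_min) / res).astype(int)
    j = np.floor((lons - lon_min) / res).astype(int)
    inside = ((i >= 0) & (i < n_lat) & (j >= 0) & (j < n_lon)
              & np.isfinite(values))
    total = np.zeros((n_lat, n_lon))
    count = np.zeros((n_lat, n_lon))
    # np.add.at accumulates correctly when several stations share a cell.
    np.add.at(total, (i[inside], j[inside]), values[inside])
    np.add.at(count, (i[inside], j[inside]), 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)
```

Calling the function once per day and stacking the results would give a (time, lat, lon) array on the same 1° × 1° grid as the FROGS products.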
In order to consider only the grid points representative of the entire time series period, at most 10% of missing data was allowed for each grid point. Figure 2a,c show the percentage of days on which at least one gauge reported for each grid point in the GPCC and INPE databases, respectively, and Figure 2b,d identify the grid points from Figure 2a,c with a frequency larger than 90%. Under this criterion, the GPCC and INPE databases have 220 and 345 valid grid points, respectively.
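As an illustration of this availability criterion, the sketch below flags grid points whose fraction of reporting days reaches 90%. The array layout (time, lat, lon), the use of NaN for days without a gauge report, the inclusive boundary and the name `valid_gridpoints` are all assumptions, not details taken from the original processing.

```python
import numpy as np

def valid_gridpoints(daily, min_fraction=0.90):
    """Return (mask, fraction) for a (time, lat, lon) daily rainfall array.

    A day counts as reported when the value is finite (rain >= 0.0 mm
    included); NaN marks days without any gauge report. Treating exactly
    90% as valid is an assumption.
    """
    reported = np.isfinite(np.asarray(daily, dtype=float))
    fraction = reported.mean(axis=0)  # share of days with a gauge report
    return fraction >= min_fraction, fraction
```

The resulting boolean mask can then be applied to every satellite product so that all comparisons use the same set of grid points.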
Grid points with more than 90% of days with gauge information (rain ≥ 0.0) were used, which gave 220 grid points from the GPCC FROGS database and 345 grid points from the INPE database. Analysis of the rain gauge datasets from GPCC and INPE revealed that the INPE database had greater spatial coverage, and a comparison between the two datasets (not shown here) indicated only a small difference between them. For these reasons, we chose the dataset provided by INPE to perform this work, and the INPE rainfall database is used hereafter as the reference for validation of the satellite precipitation estimates of the different products. It is important to mention that neither database provides complete rain gauge coverage of the study area. However, the INPE database covers a more complete network, with more grid points represented by rain gauges.
To ensure data quality, INPE performs an automatic data quality control that considers four steps: (i) range test, which seeks to eliminate gross errors outside the confidence interval; (ii) step test, which considers a maximum difference between consecutive values; (iii) internal consistency test, which makes an association between different meteorological parameters; and (iv) persistence test, which identifies the variability of measurements over a long period of time [33].

Assessment of Satellite Products: Statistical Analysis
Statistical analysis was performed to compare satellite data with the gauge-based rainfall data. The objective of this comparison was to evaluate the ability of different algorithms to estimate extreme rainfall. In doing so, we considered the following statistical measures [34][35][36][37]:
(a) The correlation coefficient (r) refers to the agreement between satellite-based rainfall and gauge-based rainfall:
$$r = \frac{\sum_{i=1}^{n} (P_i^G - \bar{P}^G)(P_i^S - \bar{P}^S)}{\sqrt{\sum_{i=1}^{n} (P_i^G - \bar{P}^G)^2}\,\sqrt{\sum_{i=1}^{n} (P_i^S - \bar{P}^S)^2}} \tag{1}$$
where $P_i^G$ is the gauge-based precipitation value at pixel $i$, $P_i^S$ is the satellite-based precipitation value at pixel $i$, $n$ is the number of pixels included in the analysis, and an overbar denotes the mean over the $n$ pixels. The correlation coefficient ranges between −1 and +1. The value of +1 indicates a perfect positive fit, in other words, a perfect linear correlation. We use the term positive correlation when r > 0, in which case $P_i^S$ increases on average as $P_i^G$ grows, and negative correlation when r < 0, in which case $P_i^S$ decreases on average as $P_i^G$ grows.
(b) RMSE: Root Mean Square Error is one of the most commonly used measures of the average magnitude of the error and is sensitive to larger errors:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (P_i^S - P_i^G)^2} \tag{2}$$
(c) BIAS indicates the average tendency of the satellite-based rainfall fields to be larger or smaller than the rain gauges; the best value is 0; negative (positive) values indicate an underestimation (overestimation):
$$\mathrm{BIAS} = \frac{1}{n}\sum_{i=1}^{n} (P_i^S - P_i^G) \tag{3}$$
(d) Standard Deviation is a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range:
$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (P_i - \bar{P})^2} \tag{4}$$
where $P_i$ stands for either $P_i^G$ or $P_i^S$.
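The four measures (a) to (d) can be computed together, as in the following minimal sketch. It assumes the standard definitions (Pearson correlation, bias as the mean of satellite minus gauge, and the population standard deviation with a 1/n normalization); the function name `extreme_metrics` and the returned dictionary keys are hypothetical.

```python
import numpy as np

def extreme_metrics(gauge, sat):
    """Correlation, RMSE, bias and standard deviations for paired pixels.

    gauge, sat: 1-D arrays of extreme values (e.g. per-pixel maximum or
    99th percentile). Pairs with a missing value in either array are dropped.
    """
    gauge = np.asarray(gauge, dtype=float)
    sat = np.asarray(sat, dtype=float)
    ok = np.isfinite(gauge) & np.isfinite(sat)
    g, s = gauge[ok], sat[ok]
    diff = s - g  # satellite minus gauge: negative means underestimation
    return {
        "r": float(np.corrcoef(g, s)[0, 1]),
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "bias": float(np.mean(diff)),
        "std_sat": float(np.std(s)),      # ddof=0, i.e. 1/n normalization
        "std_gauge": float(np.std(g)),
    }
```

For example, a product that is exactly 2 mm below the gauges at every pixel gives r = 1, RMSE = 2 mm and BIAS = −2 mm, which shows why correlation alone cannot reveal a systematic underestimation.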
Taylor diagrams were used to facilitate comparison between the data estimated by different models and the reference data [38][39][40]. These diagrams are used to quantify the degree of correspondence between the modeled and observed data in terms of three statistics: Pearson correlation coefficient (r), root mean square error (RMSE), and standard deviation (std). The x and y axes represent the standard deviation of the estimated and observed data, respectively. The green arc represents the RMSE and the arc on the right side denotes positive Pearson correlation coefficient values. The colored circles inside the diagram represent the models (satellite products) and the white circle, over the x-axis, represents the reference data (INPE). The closer a model is to the observation point, the better the satellite product is. To compare the statistical behavior of the extreme precipitation values of the different satellite products, the graphs of the probability density function (PDF) and empirical cumulative distribution function (ECDF) of the maximum rainfall values and several percentile values were generated. To estimate the density curves, the non-parametric kernel method was used [41,42].
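The geometry behind a Taylor diagram can be checked numerically. In Taylor's construction, a product is placed at radius equal to its standard deviation and at an angle whose cosine is the correlation with the reference; the distance to the reference point then equals the centered RMS difference (the RMS difference after removing each mean), which coincides with the RMSE only when the bias is zero. The sketch below assumes this standard construction; `taylor_point` is a hypothetical helper and does not reproduce the plotting used for the figures.

```python
import numpy as np

def taylor_point(ref, est):
    """Return (x, y, distance_to_reference, centered_rms_difference)."""
    ref = np.asarray(ref, dtype=float)
    est = np.asarray(est, dtype=float)
    sigma_ref, sigma_est = ref.std(), est.std()
    r = np.corrcoef(ref, est)[0, 1]
    theta = np.arccos(np.clip(r, -1.0, 1.0))  # angle from the x-axis
    x, y = sigma_est * np.cos(theta), sigma_est * np.sin(theta)
    # The reference sits on the x-axis at (sigma_ref, 0).
    distance = np.hypot(x - sigma_ref, y)
    centered = np.sqrt(np.mean(((est - est.mean()) - (ref - ref.mean())) ** 2))
    return float(x), float(y), float(distance), float(centered)
```

The last two returned values agree to rounding error, which is the law-of-cosines identity that lets a single point summarize the three statistics.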
Denoted by $f_X(x)$, the PDF describes the behavior, in polygonal form, of the frequency distribution of a random variable. In this study, maximum rainfall and the extreme rainfall threshold are represented by $X$. The probability (Prob) of the random variable not exceeding a given value of interest is calculated using the CDF, $F_X(x)$, represented by Equation (5):
$$F_X(x) = \mathrm{Prob}(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt \tag{5}$$
Inversely, the corresponding PDF can be obtained by differentiating Equation (5):
$$f_X(x) = \frac{dF_X(x)}{dx} \tag{6}$$
The CDF of a continuous random variable is a non-decreasing function that satisfies $F_X(-\infty) = 0$ and $F_X(+\infty) = 1$ [43][44][45][46].
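A minimal sketch of how an ECDF and a kernel density estimate can be obtained from a sample of extreme values is given below. The text specifies only a non-parametric kernel method, so the Gaussian kernel and the normal-reference bandwidth (1.06 σ n^(−1/5)) used here are assumptions, as are the function names.

```python
import numpy as np

def ecdf(sample):
    """Empirical CDF: sorted values and the fraction of data <= each value."""
    x = np.sort(np.asarray(sample, dtype=float))
    f = np.arange(1, x.size + 1) / x.size
    return x, f

def gaussian_kde_pdf(sample, grid, bandwidth=None):
    """Gaussian kernel density estimate evaluated on `grid`."""
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = sample.size
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, see lead-in).
        bandwidth = 1.06 * sample.std(ddof=1) * n ** (-1 / 5)
    z = (grid[:, None] - sample[None, :]) / bandwidth
    kernel_sum = np.exp(-0.5 * z ** 2).sum(axis=1)
    return kernel_sum / (n * bandwidth * np.sqrt(2.0 * np.pi))
```

The ECDF ends at 1 by construction, and the estimated density integrates to approximately 1 over a sufficiently wide grid, mirroring the limiting properties of the CDF stated above.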
In addition to the statistical measures and graphical analysis of PDF and ECDF, a statistical test was also used. A preliminary analysis was carried out to choose between a parametric test, where the assumptions of normality, independence and constant variance must be satisfied, and a non-parametric test, where these assumptions do not all hold. The databases used were independent, but did not satisfy the other assumptions, so the Wilcoxon-Mann-Whitney test was chosen [47]. This is a non-parametric test employed as an alternative to the Student t-test (parametric cases) for two independent samples [48][49][50]. The objective is to test whether two distributions are equal in location, that is, whether the observed and estimated data distributions have the same median; in this case the medians of the extreme values were evaluated. To apply the Wilcoxon-Mann-Whitney test, it is assumed that F and G are the distribution functions corresponding to the observed and estimated data, and the null hypothesis states that the distributions are equal, $H_0: F(X) = G(Y)$. The test statistic is calculated by the equation below [47]:
$$U = mn + \frac{m(m+1)}{2} - S_m$$
where $m$ and $n$ represent the number of elements in samples X and Y, respectively, and $S_m$ denotes the sum of the ranks of the elements of sample X in the combined sample. To decide whether or not to reject the null hypothesis, the p-value (the probability, under the null hypothesis, of obtaining a test statistic at least as extreme as the value observed in the sample) is calculated. When this is below the desired significance level (here 5%), the null hypothesis is rejected.
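The test can be sketched with NumPy alone, assuming the common rank-sum form U = mn + m(m+1)/2 − S_m and a two-sided p-value from the normal approximation with tie correction and no continuity correction. These choices and the function names are assumptions; a statistical package may use an exact distribution for small samples and return slightly different p-values.

```python
import math
import numpy as np

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(values.size)
    ranks[order] = np.arange(1, values.size + 1)
    _, inverse, counts = np.unique(values, return_inverse=True,
                                   return_counts=True)
    group_sum = np.bincount(inverse, weights=ranks)
    return (group_sum / counts)[inverse], counts

def wmw_test(x, y):
    """Return (U, two-sided p-value) for two independent samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m, n = x.size, y.size
    ranks, ties = average_ranks(np.concatenate([x, y]))
    s_m = ranks[:m].sum()                  # rank sum of sample X
    u = m * n + m * (m + 1) / 2 - s_m
    total = m + n
    tie_term = ((ties ** 3 - ties).sum()) / (total * (total - 1))
    var_u = m * n / 12 * ((total + 1) - tie_term)
    if var_u <= 0:                         # all values identical
        return float(u), 1.0
    z = (u - m * n / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail
    return float(u), float(p)
```

With x taken as the per-pixel extremes of the reference and y as those of one product, a p-value above 0.05 means the null hypothesis of equal distributions is not rejected at the 5% level.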

Characterization of Extreme Rainfall
Extreme rainfall is one of the most severe weather hazards affecting the globe. To investigate these extreme events, statistical tools can be used to interpret a dataset and to find a constant threshold. This threshold is based on the empirical distribution of the variable at each location so as to ensure that a given fraction of events will by definition be extreme.
The most common way to choose a threshold is to use the quantiles as cutoff points that divide the range of a probability distribution into contiguous intervals with equal probabilities. However, there is still no consensus in meteorology about the precipitation thresholds for the identification of extreme events, so the choice is highly variable. Liebmann et al. [51] studied the interannual variability of the events of daily extreme precipitation in the state of São Paulo and defined as extreme events those in which the daily precipitation exceeds a percentage of its seasonal or annual average. In turn, [52] defined the criterion of a 50 mm/day isohyetal line enclosing an area of not less than 10,000 km² in the domain of southern Brazil. Dereczynski et al. [53] studied intense rainfall events in Rio de Janeiro during 1997 to 2006 and applied the percentile technique for each rain station. They observed a threshold higher than 30 mm for the daily precipitation totals corresponding to the 99th percentile of all rain stations considered. Alves et al. [54] investigated the main atmospheric characteristics associated with extreme precipitation events in northeastern Brazil and defined extreme events as daily accumulation of precipitation with values greater than or equal to 50 mm.
In this work, we applied the quantile technique by establishing values above a certain threshold. Using the empirical cumulative distribution function (ECDF), we carried out a preliminary study with the thresholds of 99.0, 99.5 and 99.9 percentiles and the maximum value of the distribution of precipitation over tropical South America during the five-year period.
In this study, the thresholds of 99.0 percentile (p99th, hereafter) and maximum value were chosen to characterize extreme precipitation events over South America. As shown in Figure 3a,d, there was a slight difference between the two approaches. While almost all algorithms underestimated the maximum value and there were large discrepancies among them, the p99th, even though considered to be an extreme value from a statistical point of view, showed better agreement among the algorithms when compared with the reference data. The first situation could be because the maximum values are highly sensitive to unfiltered outliers or very rare events in the probability distribution of the reference data and/or the FROGS dataset. The maximum and p99th are explored in Section 3.
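The two extreme-value definitions can be illustrated per grid point with the short sketch below. It assumes a (time, lat, lon) array in which zeros are kept and missing days are NaN, and it relies on NumPy's default linear interpolation between order statistics, which may differ from the percentile estimator actually used; `extreme_thresholds` is a hypothetical name.

```python
import numpy as np

def extreme_thresholds(daily, percentiles=(99.0, 99.5, 99.9)):
    """Per-grid-point percentiles and maximum of daily rainfall.

    daily: array (time, lat, lon); all days with rain >= 0.0 mm enter the
    distribution, and NaN marks missing days.
    Returns (pct, maximum) with pct shaped (len(percentiles), lat, lon).
    """
    daily = np.asarray(daily, dtype=float)
    pct = np.nanpercentile(daily, percentiles, axis=0)
    maximum = np.nanmax(daily, axis=0)
    return pct, maximum
```

Because the maximum is a single order statistic, one unfiltered outlier changes it directly, whereas the 99th percentile of a five-year daily series is bounded by roughly the 18 largest values and is therefore more stable.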

Results And Discussion
Since the launch of the first satellite completely focused on precipitation studies (TRMM), the academic community has been evaluating the performance of satellite precipitation estimates obtained by different algorithms [24,55–58]. For a long time, these efforts were focused on obtaining the average rainfall characteristics on a global basis. It was found that, for average values, the estimates showed a pattern similar to the observed rainfall [59]. However, this does not happen with extreme rainfall values, where satellite estimates tend to underestimate or overestimate retrievals depending on the physical characteristics of rainfall systems (deep vs. shallow convection systems) and other regional factors like orography and local circulation patterns, among others [11].
To summarize the inter-product differences, the zonal mean precipitation was computed (Figure 4) by aggregating the multiple measurements in each latitude band (2° bins) and plotting the mean rainfall. Note that for the average values, the estimates are close to the reference values (in black), presenting larger dispersion in equatorial latitudes (5°S to 10°N) and close to the southern edge of the region, between 25°S and 30°S. According to the climatology, these two regions presented the largest rain accumulation in the studied region [60]. However, this disagreement among different algorithms and the reference data could also be due to a data sampling issue. While the central region has the largest number of valid grid points per latitude band (see Figure 2d), the equatorial region has the smallest number of valid grid points. The southern region has an intermediate number of valid points (Figure 4).
While there is a reasonable agreement for the mean values, large differences are observed in the ECDF of the extreme values, as can be noted in Figure 3. Therefore, approaches to extreme precipitation analysis are described in the next two subsections: (i) maximum daily values, and (ii) p99th obtained for each database and the reference.
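The zonal aggregation in 2° latitude bands can be sketched as follows. The sketch assumes a (lat, lon) field with NaN at invalid grid points and cell-centre latitudes on the 1° grid; the function name and the handling of the last, partly filled band are assumptions.

```python
import numpy as np

def zonal_mean(field, lats, bin_width=2.0, lat_min=-30.0, lat_max=15.0):
    """Mean of `field` over all valid grid points in each latitude band.

    field: (lat, lon) array, NaN where the grid point is not valid.
    lats:  cell-centre latitudes matching the first axis of `field`.
    Returns (band_centres, band_means); empty bands are NaN.
    """
    field = np.asarray(field, dtype=float)
    lats = np.asarray(lats, dtype=float)
    edges = np.arange(lat_min, lat_max + bin_width, bin_width)
    centres = 0.5 * (edges[:-1] + edges[1:])
    band = np.digitize(lats, edges) - 1  # band index of each latitude row
    means = np.full(centres.size, np.nan)
    for k in range(centres.size):
        rows = field[band == k]
        if np.isfinite(rows).any():
            means[k] = np.nanmean(rows)
    return centres, means
```

Returning the number of valid points per band alongside the mean would make the sampling issue discussed above explicit, since bands with few valid points yield noisier averages.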

Extreme Rain from Maximum Daily Values
The same zonal mean analysis was performed for the maximum estimated values for each product and for the reference data, as shown in Figure 5. All products tended to underestimate the maximum precipitation values for almost all latitude bands, and larger dispersion was noticed for the same regions, as described for the mean values in Figure 4.
In equatorial latitudes (5°S to 10°N), there was an abrupt variation in the maximum values of rain, as can be seen in the reference data (black line), with variation from 60 mm to 120 mm in just a few degrees of latitude. This variation may be associated with the short-lived convective systems that are common in this region [61]. The algorithms were unable to reproduce this behavior. In the region between 5°S and 20°S, there was minor variation of the maximum rainfall values, which were all near 100 mm. All algorithms underestimated the maximum rainfall in this central region of South America, but the dispersion among them was small. Latitudes between 20°S and 30°S had the highest values for maximum rainfall; some products tended to underestimate these values, while others overestimated them. This behavior will be analyzed later in this section.
Figure 6 shows the frequency of maximum rainfall using the probability density function (PDF) of satellite products in South America. According to this plot, the most frequent values were between approximately 75 mm/day and 100 mm/day for the studied region. As shown in the previous analysis, satellite estimates tended to underestimate, with several degrees of disagreement, the maximum values of the PDF.
According to the results, the distributions of the products 3B42, GSMAP, GSMAPg, CMORPHg and COSCHg presented the closest agreement with the reference database's PDF, as shown in Figure 6. However, according to the Wilcoxon-Mann-Whitney test, the distributions of the maximum precipitation values estimated by the GSMAP and 3B42g products were statistically equivalent to the distribution of the maximum precipitation values observed according to the reference database (p-value > 0.05). The data distributions estimated by the 3B42g and GSMAP products had medians of 84.11 mm and 77.27 mm, respectively, the closest to the median of the reference (INPE) data distribution (89.20 mm). On average, the products with the smallest biases were 3B42g, GSMAP and TAPEER, with −3.05 mm, −3.82 mm, and −7.79 mm, respectively. As mentioned, the Taylor diagram provides a concise statistical summary of how well patterns match each other in terms of their correlation, root mean square error and standard deviation. In this case, the shortest Euclidean distance to the reference data represents the closest agreement. COSCHg presented the best agreement when compared with other algorithms, as shown in Figure 7, with the highest correlation coefficient (r = 0.72) and the smallest standard deviation and root mean square error (std = 26.9 and RMSE = 24.5).
The COSCHg results (r = 0.72, std = 26.9, and RMSE = 24.5 for the entire region) were expected based on [28], since the COSCHg product merges satellite estimates with daily gauge data. In this case, TMPA is used as a high-quality rainfall algorithm, and a database very similar to the one used here as a reference, composed of daily rain gauge observations, is used to correct the bias of the near-real-time estimates over South America. As shown in Figure 7, bias was negative for South America, indicating that all products tended to underestimate maximum rainfall values.
Figure 5 shows the distribution of extreme rainfall differences with latitude and the behavior of each product. Figure 8 shows the spatial distribution of the maximum rainfall (mm/day) that occurred in each valid pixel during the period analyzed, for all SRPs and the reference data. The spatial pattern showed a clear delimitation of maximum values for almost all SRPs. In the reference dataset (Figure 8M), the largest maximum rainfall values were reached in the southern and northern regions of Brazil, while the smallest values were observed in the semiarid region of northeastern Brazil. This pattern was also observed, with different degrees of agreement, in all studied algorithms.
The behavior mentioned above may be associated with the different types of meteorological systems that occur in each region [23] as well as the composition of hydrometeors within the clouds. In the southern region, clouds with significant ice content and deep convection are typically observed [15], while in the northeastern region clouds generally have shallow convection with little or no ice content [12].
[Figure panel: INPE. Color bar: Maximum rainfall [mm/day], 0 to 120.]
Figure 8 also shows differences between the estimated maximum values and the observed maximum values. All products tended to underestimate extreme rainfall values. However, products such as CHIRP (Figure 8A) and PERSIANN (Figure 8J) tended to underestimate these values in all regions. This could be due to the fact that both rely on infrared information for rainfall retrieval. Salio et al. [11] showed that other IR-based algorithms (Hydro Estimator in that case) have difficulty in retrieving the largest precipitation values.
The products without rainfall adjustments tended to underestimate the lowest values and overestimate the highest values, as can be seen in the southern and the northeast regions of Brazil, respectively (Figure 8A–D and Figure 8L). The results are similar to those found by [62] for East Africa. According to the authors, the performance of satellite products is weaker when rainfall intensity is high, and after adjustments, they observed a closer estimate in some regions.
Figure 9 depicts the difference between the maximum rain in the reference data and the maximum rain estimated by each product at each grid point. Some products overestimated the maximum values in some regions (reddish colors) while underestimating them in others (bluish colors).
While some products (CHIRPS, CHIRP and PERSIANN, among others) underestimated the maximum value for almost the whole region, others had different behavior according to the region. GSMAP, 3B42g and TAPEER tended to overestimate the maximum values over the southern region while underestimating them in the central region. The regional analysis will be investigated in the near future to establish a relationship between anomalies of large and regional circulation patterns and extreme values.

Extreme Rain from 99th Percentile Threshold
The second approach used to analyze extreme precipitation over South America was application of the 99th percentile threshold. The zonal analysis was performed for the p99th values of each precipitation estimation product and the reference data (in black), as shown in Figure 10. According to this plot, most of the products tended to underestimate p99th, and the pattern is very similar to that of the maximum values, with larger dispersion in the southern region between 30°S and 25°S and close to the equator. Figures 5 and 10 indicate that both maximum and p99th values for products such as GSMAP, CMORPH, 3B42 and TAPEER overestimated the reference values between 30°S and 20°S. These results are similar to those obtained by [5], who used other reference data sources (GPCC v2018 and CPC v1.0), considered all points on ocean/land and interpolated points. In the central region, underestimation is present in all products, with a smaller dispersion due to the larger number of valid points considered, while the equatorial region is noisier and has larger dispersion due to the smaller number of valid points considered in the comparison and the complex behavior of the rainfall systems over the Amazonian region.
An analysis similar to that performed for the maximum values is shown in Figure 11. This plot depicts the frequency of the 99th percentile of rainfall using the probability density function (PDF) of satellite products for South America. According to the reference data, the most frequent p99th daily values were between 30 mm/day and 50 mm/day. This analysis considered daily rainfall values ≥ 0.0 mm in order to analyze the p99th for each grid point, and the same amount of data was present in each distribution.
Still referring to Figure 11, according to the PRF's PDF, each product had a different maximum in the frequency (mode). In general, all products tended to underestimate the frequency of p99th rainfall values. The distributions of the products GSMAPg, CMORPHg, and COSCHg had the best agreement compared to the observed PDF (INPE) over South America. However, according to the results of the Wilcoxon-Mann-Whitney test, the distributions of the p99th threshold of precipitation values estimated by the 3B42g and TAPEER products were statistically equivalent to the distribution of the p99th precipitation values observed by INPE, with p-value > 0.05. Other products had a p-value < 0.05. The data distributions estimated by the 3B42g and TAPEER products had a median of 38.5 mm and 38.6 mm, respectively, close to the median of the reference data distribution, which was 39.5 mm. On average, the products that showed the smallest biases were 3B42g and TAPEER, with 1.07 mm and 0.19 mm, respectively. According to Figure 12, when statistical metrics such as correlation coefficient, mean square error and standard deviation were applied to the COSCHg product, better performance was achieved in comparison with the reference data. This Taylor diagram describes the performance of satellite-derived extreme rainfall estimates in comparison with gridded rain gauge data and shows the correlation coefficient, standard deviation and mean square error values. The bias of p99th over South America is negative for the majority of the products, indicating they tended to underestimate p99th threshold rainfall values. However, exceptions occurred for TAPEER (1.07 mm) and 3B42g (0.19 mm). Figure 13 shows the spatial distribution of the 99th percentile in each pixel for estimates and the reference data. In addition, Figure 9 shows the bias related to these data. 
While the spatial pattern observed is very similar to that of the maximum precipitation (Figure 8), the values are well below the maximum values, with a smoother distribution. This situation can be explained because in this analysis, all rainfall values (including 0.0 mm) were considered in establishing the p99th. In very dry regions like the continental region of northeastern Brazil, where the number of rainy days (defined as days with accumulated rainfall greater than 1 mm) is just 35 per year [60], the p99th may therefore not represent an extreme value in terms of impacts on society. Figure 14 shows that some products overestimated the p99th values in some regions (reddish colors) while underestimating them in others (bluish colors), as observed for the maximum values. Some products, such as CHIRP, CHIRPSg, COSCHg, PERSIANNg, and CMORPHg, tended to underestimate the p99th value over South America, while 3B42, 3B42g, GSMAP, CMORPH and TAPEER tended to overestimate the p99th values over the southern region while underestimating them in the central region.

Conclusions
Extreme rainfall events can cause several social, economic and environmental problems. Studies that improve understanding of the characteristics of these events are important. The main objective of this study was to evaluate a large number of satellite products and identify their capacity to detect extreme precipitation events. In addition, the extensive spatial coverage of data observed by the rain gauge network in South America was considered to compare these data with satellite products. In this scenario, the comparison between extreme precipitation values of 11 different satellite products and rain gauge data was carried out for the tropical region of South America. Precipitation extremes were defined by maximum values and the 99th percentile threshold on a daily 1° × 1° grid for the period from 2012 to 2016. The ability of the satellite-based precipitation products to detect extreme precipitation was analyzed using statistical metrics. The SRPs evaluated were CHIRP v2.0, CHIRPS v2.0, 3B42 RT v7.0 uncalibrated, 3B42 RT v7.0, GSMAP-NRT-no gauges v6.0, GSMAP-NRT-gauges v6.0, CMORPH v1.0 RAW, CMORPH v1.0 CRT, PERSIANN CDR, CoSch, and TAPEER v1.5, using data from the rain gauges of the INPE database as the reference.
The evaluation of the different SRPs showed that the rain gauge-adjusted products performed better than the near-real-time product versions. This suggests that satellite-based estimates combined with gauge information are effective for identifying extreme precipitation. Similar results were found by [19] and [37], who showed that post-real-time products agreed well with the gauge-based observations and presented satisfactory performance for China. In addition, [63] compared the performance of TRMM, IMERG-F, and GSMaP-Gauge in Brazil and demonstrated that IMERG-F and GSMaP-Gauge presented better performance in comparison with near-real-time products. However, according to this investigation, this advantage most likely holds only in regions where rain gauge data are available.
The main findings of this study reveal that the products without rain gauge adjustments tended to underestimate the lowest values and overestimate the highest values in the southern and northeastern regions of Brazil. This corroborates the results of [62] for East Africa, where the performance of satellite products was found to be weaker when rainfall intensity was high and, after adjustments, a closer estimate was observed in some regions. However, one of the major advantages of satellite data is their availability for regions without rain gauges. In this study, we observed that GSMAP and TAPEER, which are SRPs that are not adjusted by rain gauge data, performed better than the other SRPs when the maximum values were between 75 mm and 100 mm. We also observed that the SRPs that include MW measurements performed better than those based on IR information only. This was seen in the case of CHIRP, CHIRPSg and PERSIANNg, which were the products that most underestimated the extreme values of precipitation and that had the lowest correlation coefficient (r < 0.3) for maximum values. In contrast, GSMAP, 3B42g and TAPEER, which use blended techniques, tended to overestimate the maximum values in the southern region. This leads to the hypothesis that extreme precipitation estimates derived from satellite products should be assessed regionally rather than globally, as the performance of each product varies according to the particularities of each region. Likewise, extreme rain thresholds vary greatly from region to region.
The most frequent values of extreme rainfall in South America were between 75 mm/day and 100 mm/day, but all products analyzed tended to underestimate extreme precipitation for almost all latitude bands in this region. In particular, we noted a spatial pattern of the extreme values of precipitation. The maximum and 99th percentile extreme rainfall estimates had similar features, where the extremes with the most intense volume were in the northern and southern regions of Brazil while the lowest values were observed in the semiarid region of northeastern Brazil. This spatial pattern was detected by all SRPs evaluated, with different degrees of agreement. This behavior may be associated with the different types of meteorological systems that occur in each region [23], as well as the composition of hydrometeors within the clouds. While in the southern region, clouds with significant ice content and deep convection characteristics are often observed [15], the northeastern region is characterized by shallow convection with little or no ice content [12].
Our objective was to identify the characteristics of extreme rain events represented by different satellite databases. However, other analyses are necessary. In future work, extreme precipitation events over South America will be analyzed by date of occurrence, based on the observed rainfall data, and the satellite data will be validated for each specific event, allowing us to show which product performs best under the meteorological conditions of the specific day. This will enable us to identify which physical thermodynamic processes favor the occurrence of rainfall extremes in different regions of South America.
Author Contributions: R.S.A.P. conceived the structure of this paper and wrote the original draft; D.A.V. contributed to supervision and to the discussion of scientific problems; D.T.R., D.P.Q., R.A.d.S. and J.M.d.S.A. contributed by helping the first author with data processing; R.C.P. helped in the discussions and checked the English language of the manuscript. All the authors conducted the manuscript revision and the analysis of the results. All authors have read and agreed to the published version of the manuscript.
Funding: This study was financed in part by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPQ) and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Brazil (CAPES), Finance Code 001.