Assessment on IMERG V06 Precipitation Products Using Rain Gauge Data in Jinan City, Shandong Province, China

In this study, a comprehensive assessment of precipitation estimates from the latest Version 06 release of the Integrated Multi-satellitE Retrievals for Global Precipitation Measurement (IMERG) algorithm is conducted using daily observations from 24 rain gauges from 2001 to 2016. The IMERG V06 dataset fuses Tropical Rainfall Measuring Mission (TRMM) satellite data (2000–2015) and Global Precipitation Measurement (GPM) satellite data (2014–present), enabling the use of IMERG data for long-term studies. The correlation coefficient (CC), root mean square error (RMSE), relative bias (RB), probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI) were used to assess the accuracy of satellite-derived precipitation estimates and to measure the correspondence between satellite-derived and observed occurrence of precipitation events. The probability density distributions of precipitation intensity and the influence of elevation on precipitation estimation were also examined. Results showed that, with high CC and low RMSE and RB, the IMERG Final Run product (IMERG-F) performs better than the two other IMERG products at daily, monthly, and yearly scales. At the daily scale, the ability of satellite products to detect general precipitation is clearly superior to their ability to detect heavy and extreme precipitation. In addition, CC and RMSE of IMERG products are high in Southeastern Jinan City, while RMSE is relatively low in Southwestern Jinan City. Because the IMERG estimates of extreme precipitation indices showed an acceptable level of accuracy, IMERG products can be used to derive extreme precipitation indices in areas without gauged data. At all elevations, IMERG-F performs better than the other two IMERG products. However, POD and FAR decrease and CSI increases with increasing elevation, indicating the need for improvement. This study provides valuable information for the application of IMERG products at the scale of a large city.


Introduction
In recent years, the hydrological cycle has been influenced by global climate change and intensified human activity [1]. The spatial and temporal distributions of the components of the cycle have changed, directly affecting regional water balance and inducing natural disasters such as high-intensity rain events, floods, heat waves, and droughts [2–4].
Variability in precipitation can lead to regional droughts and floods, so understanding it is crucial to water resources management and to meeting the needs of human societies [5]. To forecast floods, monitor droughts, and manage emergencies associated with natural disasters, it is critical to have high-resolution precipitation data [6,7]. Precipitation data are also used as basic drivers in various hydrological models, so the accuracy of these input data is particularly important. Precipitation data are usually collected using ground-level rain gauges, radar [8], or satellite sensors. The Tropical Rainfall Measuring Mission (TRMM) satellite was in operation from 1997 to 2015, and was replaced by the Global Precipitation Measurement (GPM) mission, which was launched on 28 February 2014. The IMERG algorithm combines information from the GPM satellite constellation to estimate precipitation over the majority of the Earth's surface. In the latest Version 06 release of IMERG, the algorithm fuses the early precipitation estimates collected during the operation of the TRMM satellite (2000–2015) with more recent precipitation estimates collected during operation of the GPM satellite (2014–present). Before satellite precipitation products are used in hydro-meteorological research, their errors need to be quantified and corrected.
Recently, many preliminary evaluations of satellite precipitation products at different spatial and temporal scales have been conducted. Tian et al. [21] found that the performance of GSMaP is comparable to that of other satellite-based products, with GSMaP having a slightly higher detection probability during the summer over the contiguous United States. Alijanian et al. [28] conducted a spatio-temporal drought assessment over Iran using the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR) and the Multi-Source Weighted-Ensemble Precipitation (MSWEP) products. Tekeli et al. [29] pointed out that the TRMM Multi-satellite Precipitation Analysis (TMPA) Real Time (RT) data (3B42RT) could be used for flash flood forecasting. A study by Haile et al. [30] suggested that CMORPH can capture the seasonal and spatial patterns of rainfall over the Lake Tana basin in eastern Africa, but with varying degrees of accuracy that depend on topography, latitude, and lake-versus-land conditions within the basin. However, only a few studies have focused on extreme rainfall events [31,32].
IMERG is the successor of TMPA, and the global IMERG products have also been evaluated. Some studies have compared GPM and TRMM products [14,33–37]. For example, Wang et al. [38] indicated that IMERG correlates better with observations than TMPA, whereas the bias of IMERG is larger than that of TMPA at multiple temporal and spatial scales over the Northeastern Tibetan Plateau. Liu et al. [39] compared IMERG and TMPA monthly products on a global scale, and found that differences between IMERG and TMPA vary with surface types and precipitation rates. Many studies have also found that complex terrain can affect the accuracy of satellite precipitation estimation [40,41]. However, these studies have rarely used the latest Version 06 release of IMERG, and they are based on IMERG precipitation products that span only a few years [42,43]. A comprehensive assessment of satellite-based precipitation products at the scale of a megacity is also lacking [44]. The IMERG V06 dataset is the latest IMERG product. It has a relatively long time series and a high spatial resolution of 0.1 degrees. Such long time series could support other research, but their applicability needs to be evaluated before they are used.
In this study, land-based observations, continuous verification statistics (correlation coefficient (CC), root mean square error (RMSE), and relative bias (RB)), and categorical verification statistics (probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI)) were used to systematically assess the performance of three IMERG V06 products (IMERG Early Run, IMERG Late Run, and IMERG Final Run) between 2001 and 2016 in Jinan City. Using several extreme precipitation indices, the performance of IMERG products was also assessed with respect to extreme precipitation events.
The remaining sections of this paper are organized as follows. The study area and datasets are presented in Section 2. Continuous and categorical verification statistics and extreme precipitation indices are introduced in Section 3. The three IMERG products are assessed in Section 4. Sections 5 and 6 present the discussions and conclusions, respectively.

Study Area
Jinan City, the capital of Shandong Province in China, is located in Central Eastern China, downstream of the Yellow River Basin and north of Mount Tai, between latitudes 36°10′ N and 37°40′ N and longitudes 116°12′ E and 117°44′ E. One of the first pilot Sponge Cities (a new-generation urban stormwater management concept in China), Jinan City has a surface area of 8177 km². It is covered by mountains, hills, and plains. The elevation is higher in the southeast and lower in the northwest (Figure 1). It had a population of more than 7 million in 2016.
Jinan City is located in the Northern Hemisphere at the middle latitudes, and has a temperate continental monsoon climate. Average annual temperature is between 13 and 15 °C [45]. Mean annual precipitation in the study area is approximately 636 mm [46,47]. Precipitation is abundant between June and September. Precipitation distribution is controlled by monsoon intensity and inter-annual variability in monsoon transit time. Precipitation is higher in strong summer monsoon years.

Land Gauge Precipitation Data
Daily data collected at 24 rain gauges in Jinan City between 2001 and 2016 were used in this study. The data were provided by the Jinan Hydrology Bureau. Although the IMERG-F product has been adjusted using Global Precipitation Climatology Centre (GPCC) products derived from global station data, the 24 rain gauges used in this study were not included in that adjustment. In addition, precipitation that fell as snow is recorded as equivalent liquid precipitation. Li et al. [47] used these data to study the precipitation characteristics of Jinan and showed that they are independent and consistent. Station locations and elevations are shown in Figure 1. Detailed information about the 24 land observation stations is provided in Table 1. Beifeng (BF) station and Gushan (GS) station lacked monitoring data for dozens of days. Missing data at BF station were filled with the arithmetic mean of the records from the three nearest rain gauges (Qunjing, Dazhan, and Sandefan), while data from Changqing, Shaoer, and Wohushan were used for GS station.
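The arithmetic-mean gap filling described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used for the gauge records: the function name, the use of None to mark a missing day, and the daily values are all hypothetical.

```python
from statistics import mean


def fill_missing_with_neighbors(target, neighbors):
    """Fill None entries in `target` with the mean of the neighbor values on the same day."""
    filled = []
    for day, value in enumerate(target):
        if value is not None:
            filled.append(value)
            continue
        # Use only the neighbors that have a record on this day.
        available = [series[day] for series in neighbors if series[day] is not None]
        filled.append(mean(available) if available else None)
    return filled


# Hypothetical daily totals (mm); None marks a missing record at the target gauge.
bf = [0.0, None, 12.4, None]
qunjing = [0.0, 3.0, 11.0, 0.0]
dazhan = [0.2, 6.0, 13.0, 0.0]
sandefan = [0.0, 4.5, 12.0, 0.6]

bf_filled = fill_missing_with_neighbors(bf, [qunjing, dazhan, sandefan])
print(bf_filled)
```

Days with a valid record are left untouched, so the filling only affects the gaps.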

Satellite Precipitation Products
The GPM mission was initiated by the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA) as a global successor of TRMM. It is an international network of satellites that provides next-generation global observations of rain and snow. It centers on the deployment of a Core Observatory satellite carrying an advanced radar/radiometer system to measure precipitation from space. Data from the Core Observatory serve as a reference to unify precipitation measurements from a constellation of research and operational satellites.
Information from the GPM satellite constellation is combined and precipitation over the majority of the Earth's surface is estimated using the IMERG algorithm. There are three main types of IMERG products. The near real-time Early Run (IMERG-E) and Late Run (IMERG-L) have minimum latencies of 4 and 12 h, respectively. The post real-time Final Run (IMERG-F) has a minimum latency of 3.5 months.
In this study, the performance of the latest Version 06 release of IMERG products (IMERG-E, IMERG-L, IMERG-F) at the daily scale between 1 January 2001 and 31 December 2016 was evaluated. The products were downloaded from the Precipitation Data Directory (https://gpm.nasa.gov/data/directory (accessed on 18 March 2021)) using the wget tool. Local time in the study area is 8 h ahead of UTC.

Materials and Methods
To reduce additional uncertainty caused by interpolation, we only extracted satellite data at the coordinates of the land stations [48]. That is, the IMERG grid cell nearest to each station was selected according to the station's longitude and latitude, so that the 24 stations correspond to 24 different grid cells.
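The nearest-grid extraction can be sketched as below. The sketch assumes that the 0.1-degree grid has cell centres at -179.95 + 0.1 i in longitude and -89.95 + 0.1 j in latitude; this layout, the function name, and the station coordinates are assumptions for illustration and should be checked against the coordinate arrays in the downloaded files.

```python
RES = 0.1        # grid spacing in degrees (0.1-degree resolution)
LON0 = -179.95   # assumed centre of the first cell in longitude
LAT0 = -89.95    # assumed centre of the first cell in latitude


def nearest_cell(lon, lat):
    """Return (i, j, lon_centre, lat_centre) of the grid cell nearest to a station."""
    i = round((lon - LON0) / RES)
    j = round((lat - LAT0) / RES)
    return i, j, round(LON0 + i * RES, 2), round(LAT0 + j * RES, 2)


# Hypothetical station coordinates (degrees east, degrees north).
i, j, lon_c, lat_c = nearest_cell(117.02, 36.68)
print(i, j, lon_c, lat_c)
```

Taking the value of a single cell avoids interpolation, at the cost of comparing a point measurement with an areal average over the cell.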
Three continuous and three categorical verification statistics were selected because they are widely used and perform well (Table 2). The continuous verification statistics, namely the correlation coefficient (CC), root mean square error (RMSE), and relative bias (RB), were used to measure the accuracy of IMERG products [7,9,30]. CC indicates the degree of agreement between satellite data and station observations. RMSE describes the magnitude of the difference between satellite data and station observations. RB indicates the systematic bias between satellite products and station observations. The three widely applied categorical verification statistics, which describe the contingency of satellite precipitation estimates, are the probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI) [37,41]. POD measures the hit rate, i.e., the fraction of observed precipitation events detected correctly by satellite products. FAR denotes the fraction of the precipitation events indicated by satellite products that were actually nonevents. CSI describes the overall proportion of precipitation events correctly detected by satellite products. Values of categorical statistics range from 0 to 1. In this study, 1 mm/day was set as the threshold for precipitation events.
Note to Table 2: n is the number of samples, i represents the ith sample, x denotes a precipitation estimate derived from satellite data, y denotes precipitation measured at the land station, x̄ and ȳ denote the mean values of x and y, H represents the number of precipitation events that have been both observed and detected, M is the number of precipitation events that have been observed but not detected, and F represents the number of precipitation events that have been detected but not observed.
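The six statistics can be computed as in the following sketch, which uses their commonly used definitions with the symbols of the note above (x, y, H, M, F). It assumes that these definitions match Table 2, that RB is expressed in percent, and that a day counts as an event when precipitation is at least 1 mm; the sample values are hypothetical.

```python
import math


def continuous_stats(sat, obs):
    """Return (CC, RMSE, RB in %) for paired satellite (x) and gauge (y) values."""
    n = len(obs)
    mean_x = sum(sat) / n
    mean_y = sum(obs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(sat, obs))
    var_x = sum((x - mean_x) ** 2 for x in sat)
    var_y = sum((y - mean_y) ** 2 for y in obs)
    cc = cov / math.sqrt(var_x * var_y)
    rmse = math.sqrt(sum((x - y) ** 2 for x, y in zip(sat, obs)) / n)
    rb = 100.0 * sum(x - y for x, y in zip(sat, obs)) / sum(obs)
    return cc, rmse, rb


def categorical_stats(sat, obs, threshold=1.0):
    """Return (POD, FAR, CSI) from hits H, misses M, and false alarms F."""
    hits = misses = false_alarms = 0
    for x, y in zip(sat, obs):
        detected = x >= threshold   # assumed: event when >= threshold
        observed = y >= threshold
        if detected and observed:
            hits += 1
        elif observed:
            misses += 1
        elif detected:
            false_alarms += 1
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    csi = hits / (hits + misses + false_alarms)
    return pod, far, csi


# Hypothetical daily values (mm/day) at one station.
obs = [0.0, 5.0, 12.0, 0.0, 30.0, 2.0]
sat = [1.5, 4.0, 15.0, 0.0, 22.0, 0.5]

cc, rmse, rb = continuous_stats(sat, obs)
pod, far, csi = categorical_stats(sat, obs)
print(cc, rmse, rb, pod, far, csi)
```

The sketch omits guards against empty samples or series without any events, which would cause a division by zero.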
The joint World Meteorological Organization Commission for Climatology (CCl)/World Climate Research Programme (WCRP) project on Climate Variability and Predictability (CLIVAR) Expert Team on Climate Change Detection, Monitoring and Indices (ETCCDMI) has developed a series of indices to identify and quantify extreme precipitation events from daily rainfall data. Recent studies have described and analyzed many extreme precipitation indices, which have been applied in global and regional climate change studies [49–52].
Considering the precipitation characteristics in Jinan City, seven indices were selected for this study (Table 3). These indices are a subset of the 27 core indices defined by the ETCCDMI. RX1day and R95p highlight the extreme precipitation events that can pose a great risk to society. SDII indicates precipitation intensity. Consecutive wet days (CWD) can be used to indicate flooding risk. R10, R20, and R50 describe the precipitation pattern of a year. Each index was calculated for each year from daily data. To compare the results from observations and satellite data, the multi-year average of each index was then calculated.

Table 3. Extreme precipitation indicators.
RX1day: Maximum 1-day precipitation amount
SDII: Simple daily precipitation intensity index, precipitation per unit time
CWD: Maximum number of consecutive days of precipitation
R10: Number of days with a precipitation amount more than 10 mm
R20: Number of days with a precipitation amount more than 20 mm
R50: Number of days with a precipitation amount more than 50 mm
R95p: Precipitation that is greater than the 95th percentile

Figure 2 shows density scatterplots of daily rain gauge measurements over Jinan City and the corresponding precipitation estimates from the three IMERG products. Continuous verification statistics quantifying the accuracy of the IMERG products are also shown in the figure. IMERG-F has the best performance, with the highest CC (0.71) and the lowest RMSE (5.85 mm) and RB (10.39%). Conversely, IMERG-E has the lowest CC (0.64) and IMERG-L has the highest RB (28.67%). A positive RB represents an overestimate. Therefore, precipitation estimates from all three products exceed rain gauge measurements: the IMERG-E, IMERG-L, and IMERG-F estimates are 125.83%, 128.67%, and 110.39% of rain gauge measurements, respectively.
Table 4 shows the performance of IMERG products in detecting general precipitation events (daily precipitation amount < 20 mm) and heavy and extreme precipitation events (daily precipitation amount ≥ 20 mm). The performance of IMERG-F is slightly better than that of IMERG-E and IMERG-L, although the differences between the products are small. For general precipitation events, IMERG-F is superior, with the highest average POD (0.949) and CSI (0.615) and the lowest average FAR (0.364). However, it has only the second highest POD for heavy and extreme precipitation events. For general precipitation events, POD values for all three products are acceptable and exceed 0.94. For heavy and extreme precipitation events, POD values are much lower, at 0.518, 0.567, and 0.558 for IMERG-E, IMERG-L, and IMERG-F, respectively.
The CSI values show that the ability of the IMERG products to successfully detect general precipitation events (52–61.5%) is considerably better than their ability to detect heavy and extreme precipitation events (31.4–36.8%). Furthermore, FAR values for general precipitation events are lower than those for heavy and extreme precipitation events. High FAR and low POD and CSI values indicate that the ability of IMERG products to detect heavy and extreme precipitation is still low. This may be because IMERG products are better at reproducing the general precipitation threshold, whereas their ability to estimate the heavy and extreme rainfall threshold still needs to be improved.
Table 4. Mean categorical verification statistics: probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI) (Heavy and extreme precipitation represents daily precipitation amount greater than or equal to 20 mm).
Table 5 shows the average continuous verification statistics for IMERG precipitation estimates. "Monthly" indicates statistics for monthly estimates for each month between January 2001 and December 2016. "Flood season" indicates statistics for the flood season, i.e., from June to September. The performance of IMERG-F is better than that of IMERG-E and IMERG-L. Based on monthly data for the whole year, IMERG-F has average CC, RMSE, and RB of 0.95, 25.12 mm, and 10.39%, respectively. For the flood season, IMERG-F has the best performance of the three products, with the highest CC (0.91) and the lowest RMSE (23.34 mm) and RB (9.92%). Interestingly, IMERG products have lower CC, RMSE, and RB for the flood season than for all months. This may be due to the amount of data: a larger sample would increase the correlation, but also the error. IMERG-L has the highest RMSE (44.51 mm at the monthly scale and 38.47 mm for the flood season) and RB (28.67% at the monthly scale and 20.35% for the flood season) and has the worst performance of the three products.
Generally, IMERG-F has the best performance, especially for the flood season. Figure 3 shows average monthly precipitation from rain gauges and all three IMERG products. The temporal variation of monthly precipitation is adequately captured by all IMERG products. Precipitation mainly occurs during the flood season, i.e., from June to September. The IMERG products overestimate precipitation in almost all months. The exceptions are that IMERG-E slightly underestimates precipitation in April and June, and IMERG-L slightly underestimates precipitation in April.

Spatial Differences between Satellite Precipitation Products
The spatial distribution of continuous and categorical verification statistics provides information on the accuracy of satellite precipitation products at different locations and contributes toward minimizing errors in hydrological studies where satellite products are used [53–55]. Based on daily data for each year, multi-year average values of these statistics were calculated for each station. Figure 5 shows the spatial distributions of the continuous verification statistics for the three IMERG products. While values vary, the spatial distributions share similar characteristics, with high CC values in Southeastern Jinan (red points), low RMSE values in Southwestern Jinan (green points), and high RB values in Eastern Jinan (red points). To compare satellite products, Zhou et al. [56] reported the proportion of stations with a CC value greater than 0.70 and an RMSE value less than or equal to 5 mm. In this study, the CC and RMSE thresholds were adjusted to 0.67 and 6.5 mm, respectively, because of the relatively small number of gauges used. For IMERG-E, 16.7% of the stations have CC exceeding 0.67. This percentage increases to 41.7% for IMERG-L and 100% for IMERG-F. For IMERG-E and IMERG-L, 25% of the 24 stations have RMSE below 6.5 mm. This percentage increases to 100% for IMERG-F.
Figure 6 shows the spatial distributions of POD, FAR, and CSI over Jinan City. The spatial distributions of POD, FAR, and CSI are similar in all IMERG products. POD is fairly uniform across stations, with values above 0.9. For FAR, the green points (lower FAR) are mostly distributed in the southeast, and for CSI, the red points (higher CSI) are also mostly distributed in the southeast. In general, all three products have similar POD but higher CSI and lower FAR, and hence a better ability to detect precipitation, over the mountainous region of Southeastern Jinan City.
It is possible that the observed precipitation at a high elevation is relatively direct, while the precipitation at a low elevation is more easily affected in the process of rain falling. However, further studies with more data are needed to explain this phenomenon. Overall, IMERG-F has the best detection ability, with the highest average POD (0.949) and CSI (0.615) and the lowest average FAR (0.364).

Extreme Precipitation Indices
Table 6 shows the values of the seven selected extreme precipitation indices derived from land station measurements and IMERG products. For RX1day, the IMERG-E and IMERG-L estimates exceed the value derived from rain gauge data by 1.24 mm and 8.79 mm, respectively, whereas the IMERG-F estimate is below it. For SDII, IMERG estimates are similar to the value derived from rain gauge data, indicating that IMERG products can be used to derive acceptable estimates of yearly precipitation intensity. For the number of consecutive wet days (CWD) and the precipitation above the 95th percentile (R95p), estimates from IMERG products consistently exceed the values derived from station data. This may be because IMERG products overestimate the light precipitation that often occurs at the beginning or the end of precipitation events. For R50, IMERG-L has the best performance of the three products. The IMERG-L R50 estimate is the closest to that derived from rain gauge data, with a difference of only 0.19 days. The IMERG-E and IMERG-F R50 estimates are both lower than the R50 value derived from rain gauge data. For R20 and R10, the performance of IMERG-F is superior. Differences between IMERG-F estimates and values derived from station measurements are 0.78 days (for R20) and 2.01 days (for R10), which are acceptable considering the total sample size of 5844 days.
Since IMERG estimates of extreme precipitation indices have an acceptable level of accuracy, IMERG products can be used to derive extreme precipitation indices in areas without station data.
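The seven indices of Table 3 can be derived from one year of daily totals as in the sketch below. Several details are assumptions rather than choices documented in this study: a wet day is taken as a day with at least 1 mm, the day-count indices use "greater than or equal to" following the usual ETCCDMI convention, and the 95th-percentile threshold is passed in by the caller instead of being estimated here. The daily values are hypothetical.

```python
def extreme_indices(daily, p95, wet=1.0):
    """Compute the seven indices for one year of daily precipitation totals (mm)."""
    wet_days = [p for p in daily if p >= wet]

    # Longest run of consecutive wet days.
    longest = run = 0
    for p in daily:
        run = run + 1 if p >= wet else 0
        longest = max(longest, run)

    return {
        "RX1day": max(daily),
        "SDII": sum(wet_days) / len(wet_days) if wet_days else 0.0,
        "CWD": longest,
        "R10": sum(1 for p in daily if p >= 10.0),
        "R20": sum(1 for p in daily if p >= 20.0),
        "R50": sum(1 for p in daily if p >= 50.0),
        "R95p": sum(p for p in daily if p > p95),
    }


# Hypothetical short series (mm/day) and a hypothetical 95th-percentile threshold.
daily = [0.0, 2.0, 12.0, 25.0, 0.5, 60.0, 8.0, 0.0]
indices = extreme_indices(daily, p95=40.0)
print(indices)
```

Applying the function to each year separately and averaging the results gives the multi-year values that are compared between gauges and satellite products.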

Probability Density Function of Precipitation Intensity
The probability density function (PDF) indicates the probability of the occurrence of a range of events and has been used in many studies to evaluate the quality of satellite precipitation products [57]. In this study, the PDFs were constructed from daily data.
For the categories of 0.1–1 and 40–50 mm/day, all IMERG estimates exceed the proportion derived from land station data. However, for the category of 0–0.1 mm/day, all IMERG estimates are below the proportion derived from land station data. This may be because raindrops evaporate before reaching the ground, so that the satellite products report light precipitation on days when the gauges record almost none, which leads to overestimation by the satellite products. For the category of more than 50 mm/day, IMERG estimates are similar to the proportion derived from land station data, with the IMERG-E and IMERG-L estimates slightly above it and the IMERG-F estimate slightly below it.
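The proportion of days in each intensity category can be computed as in the following sketch. Only the 0–0.1, 0.1–1, 40–50, and above 50 mm/day categories are named in the text; the intermediate bin edges, the half-open bin convention, and the sample values are assumptions for illustration.

```python
from bisect import bisect_right

# Lower edges of the intensity bins (mm/day); the last bin is open-ended (>= 50).
# The edges between 1 and 40 mm/day are assumed.
EDGES = [0.0, 0.1, 1.0, 10.0, 20.0, 40.0, 50.0]


def intensity_pdf(daily, edges=EDGES):
    """Return the fraction of days falling in each bin [edge_k, edge_k+1)."""
    counts = [0] * len(edges)
    for p in daily:
        # Index of the last edge that is <= p; assumes p is non-negative.
        counts[bisect_right(edges, p) - 1] += 1
    n = len(daily)
    return [c / n for c in counts]


# Hypothetical daily totals (mm/day).
daily = [0.0, 0.0, 0.05, 0.5, 3.0, 15.0, 45.0, 80.0]
pdf = intensity_pdf(daily)
print(pdf)
```

Computing the same proportions for the gauge series and for each IMERG product allows a category-by-category comparison.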

Discussion
The assessment of IMERG products over Jinan City indicates that the long-latency IMERG-F product generally performs better than the short-latency IMERG-E and IMERG-L products at all temporal scales. This difference is mainly attributed to quasi-Lagrangian time interpolation, the use of high-quality rain gauge data, and a climatological adjustment. The assessment of continuous verification statistics shows that the long-latency IMERG-F product has higher correlation coefficients and lower relative errors and root mean square errors, indicating that calibration of satellite products with rain gauge data can increase the accuracy of satellite-derived precipitation estimates. Compared with the short-latency products, the long-latency IMERG-F product also has an improved ability to detect precipitation. It has lower rates of falsely reporting non-events as precipitation events and of failing to report actual precipitation events.
Different climatic and topographic conditions may lead to different spatial and temporal distributions of precipitation in Jinan City [46]. Previous studies have established the influence of elevation on satellite precipitation estimates [58]. To explore the spatial differences, the performance of IMERG products in Jinan City as a function of elevation was also evaluated (Figure 8). Of the three products, the performance of IMERG-F is generally superior at all elevations. Of the three continuous verification statistics, CC and RMSE increase with elevation, although the increase in CC is small. Conversely, RB decreases rapidly with increasing elevation. The three categorical verification statistics show only small variations with increasing elevation for the three IMERG products. On the whole, the long-latency IMERG-F product has the highest POD and CSI and the lowest FAR at any elevation and, thus, the best precipitation detection ability. However, POD and FAR decrease and CSI increases with increasing elevation, indicating the need for improvement.
The IMERG datasets overestimate daily precipitation. The precipitation intensity assessment offers a possible explanation: the IMERG products report light precipitation (0.1–1 mm/day) more often than the rain gauges do. This general overestimation may be related to the retrieval algorithm. Estimates of extreme precipitation indices have an acceptable accuracy, which indicates that IMERG products could be used to derive extreme precipitation indices. While this study has shown that IMERG products, especially IMERG-F, can provide high-resolution precipitation estimates for research, considerable bias was found at some stations. Furthermore, the precision of high-resolution satellite-derived precipitation estimates can be improved by taking into account topographic effects using downscaling techniques [59–61].
In addition, event-based rainstorm data are used frequently as input in urban rainstorm waterlogging simulation models (e.g., SWMM, MIKE URBAN, and InfoWorks ICM). IMERG products also provide precipitation data at a half-hourly scale. Therefore, future studies should pay attention to the accuracy of IMERG precipitation products at sub-daily scales [62]. Moreover, error component models should be used to assess sources of errors [63].

Conclusions
The latest IMERG V06 dataset fuses the early precipitation estimates collected during the operation of the TRMM satellite (2000–2015) with more recent precipitation estimates collected during operation of the GPM satellite (2014–present). Using daily precipitation data from the rain gauge network in Jinan City from 2001 to 2016, which are not used by IMERG, a comprehensive assessment of IMERG products was conducted at daily, monthly, and yearly scales. Three continuous verification statistics (CC, RMSE, and RB) and three categorical verification statistics (POD, FAR, and CSI) were used to assess the accuracy and detection ability of the IMERG precipitation products. On the basis of the results, the major conclusions are summarized as follows.
1. With high CC and low RMSE and RB, IMERG-F performs better than IMERG-E and IMERG-L at all temporal scales (daily, monthly, and yearly). At the daily scale, the ability of satellite products to detect general precipitation is clearly superior to the ability to detect heavy and extreme precipitation. Nevertheless, all IMERG products could adequately capture monthly precipitation trends and distributions.
2. There is considerable spatial variability in the performance of the three IMERG products. Values for CC and RMSE are the highest in Southeastern Jinan City, while RMSE is relatively low in Southwestern Jinan City. For all IMERG products, the ability to detect precipitation is superior over the mountainous region of Southeastern Jinan City. However, further study is needed to explain these results.
3. For seven extreme precipitation indices, differences between IMERG estimates and values derived from land station measurements are acceptable, indicating that the characteristics of the indices are adequately captured by all IMERG products (Table 6). For IMERG-F, R20 and R10 estimates are closer to values derived from land station measurements, indicating that IMERG-F is superior at detecting moderate precipitation events.
4. Categorical verification statistics indicate that the three IMERG precipitation products perform better at high elevations. At all elevations, the performance of IMERG-F is better than that of IMERG-E and IMERG-L. However, POD and FAR decrease and CSI increases with increasing elevation, indicating the need for improvement.