Evaluation of the Performance of Several Gridded Precipitation Products over the Highland Region of Yemen for Water Resources Management

Management of water resources under climate change is one of the most challenging tasks in many arid and semiarid regions. A major challenge in countries such as Yemen is the lack of sufficient long-term climate data required to drive hydrological models for better management of water resources. In this study, we evaluated the accuracy of accessible satellite and reanalysis-based precipitation products against observed data from Al Mahwit governorate (highland region, Yemen) during 1998–2007. The evaluated products were the Climate Hazards Group Infrared Precipitation with Station (CHIRPS) data, National Centers for Environmental Prediction (NCEP) Climate Forecast System Reanalysis (CFSR), Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR), Tropical Rainfall Measuring Mission (TRMM 3B42), Unified Gauge-Based Analysis of Global Daily Precipitation (CPC), and European Atmospheric Reanalysis (ERA-5). The evaluation was performed on daily, monthly, and annual time steps by directly comparing the data from each single station with the data from the nearest grid box for each product. At a daily timescale, CHIRPS best captures the daily rainfall characteristics, such as the number of wet days, with an average deviation in wet durations of around 11.53%. TRMM 3B42 is the second-best performing product for daily estimates, with an average deviation of around 34.7%. However, CFSR (85.3%) and PERSIANN-CDR (103%) overestimate the number of wet days, whereas ERA-5 (−81.13%) underestimates it, and none of these products reflects the rainfall variability of the study area. Moreover, CHIRPS is the most accurate gridded product on a monthly basis, with high correlation and lower bias. The average monthly correlation between the observations and CHIRPS, TRMM 3B42, PERSIANN-CDR, CPC, ERA-5, and CFSR is 0.78, 0.56, 0.53, 0.15, 0.20, and 0.51, respectively. The average monthly bias is −2.9, −5.25, 7.35, −25.29, −24.96, and 16.68 mm for CHIRPS, TRMM 3B42, PERSIANN-CDR, CPC, ERA-5, and CFSR, respectively. CHIRPS reproduces the spatial distribution of annual rainfall well, with a percent bias (Pbias) of around −8.68% at the five validation points, whereas TRMM 3B42, PERSIANN-CDR, and CFSR show deviations greater than 15.30, 22.90, and 66.21%, respectively. CPC and ERA-5 show a Pbias of about −88.6% from observed data. Overall, in the absence of better data, CHIRPS data can be used for hydrological and climate change studies in the highland region of Yemen, where precipitation is often episodic and measurement records are spatially and temporally limited.

performance (e.g., correlation) against gauge observations in different regions and on different terrain topographies. CHIRPS, for example, has shown high correlation with lower bias in East Africa, Egypt, and Iran [13][14][15]. The TRMM 3B42 dataset was evaluated by several studies, such as in China [16,17], and is recommended for hydrological modeling in the upper Indus basin of Pakistan [18]. The performance of PERSIANN-CDR in eastern China was significant [19], and shows good correlation in West Africa [20]. CFSR precipitation data exhibited very good performance in three basins in Turkey [21], but shows some deviation from observations, as in the case study of the Logone catchment, Chad [22]. ERA-5 displayed good performance over North America [23], while CPC was evaluated over the mountainous region of Africa [24] and proved to be in relatively good agreement with in situ data, but with variation across the Nile basin area [25].
These products, however, have not been evaluated in Yemen and cannot be directly used in hydroclimate studies until their accuracy is assessed. Hence, the current study, for the first time in Yemen, evaluates the accuracy of several daily precipitation products against the available observations from the highland region of Yemen (Al Mahwit governorate), employing the most widely accepted statistical methods at different timescales. The findings are expected to help overcome the data limitation and identify the most accurate gridded product for hydrological modeling and for regional and local-scale climate modeling, such as with the Statistical Downscaling Model (SDSM) [26], allowing for assessment of future impacts of climate change on water availability and development of adaptation measures in Yemen.

Region of Interest
This study focuses on the evaluation of several precipitation products on a daily, monthly, and annual time step over the Al Mahwit governorate in Yemen (Figure 1). Al Mahwit occupies an area of around 2800 km² and is located within the highland region of Yemen, the most fertile part of the Arabian Peninsula [27]. It is one of the most extensively terraced areas in the world, and the major producer of the country's food crops [28,29]. Rainfall in Al Mahwit is influenced by two main mechanisms: the Red Sea Convergence Zone (RSCZ) and the monsoonal Intertropical Convergence Zone (ITCZ) [30]. The RSCZ is most visible in the western part of the country and active through March to May, and to some extent during autumn season. The ITCZ reaches Yemen in July-September, moving north and then south again so that its influence lasts longer in the south [31]. Both the RSCZ and the ITCZ generate rainfall in convectional storms of high intensity and limited duration and scope, but the ITCZ storms have a larger areal extent than those of the RSCZ [32]. Environmental factors such as complex topography and the heterogeneity of land surface affect the climate of the area for temperature and precipitation [33], while land use and runoff are in turn tied to climate and morphology.

Datasets
In Al Mahwit governorate, ten stations from high and low altitude areas provided by the National Water Resources Authority (NWRA) in Yemen were used to perform this evaluation (Table 1). Five stations with ≥10 years of daily data (Kokaban, Al Rojom, Al Mahwit city, Al Khamees, and Al Khabt) are the main stations used to compare the six precipitation products with observations. Kokaban, Al Rojom, and Al Mahwit city stations are located in the high, complex mountainous areas, while Al Khamees and Al Khabt represent rainfall rates in the low areas (Figure 1). These stations were selected because they offer the best and most complete daily observations with the longest records within the period 1998-2007. The other five stations, with complete monthly records of <10 years, were used to verify the product most highly correlated with ground observations. We used all of the available stations; no other stations with long-term and complete data suitable for evaluation could be found. The rainfall datasets used in this evaluation are the latest updated gridded products with high temporal and spatial resolution. The precipitation datasets were selected based on their availability at a daily timescale, their spatial coverage of the highland region of Yemen, and their previous use in neighboring regions with similar topography, such as Ethiopia, Iran, and southern Saudi Arabia [34][35][36]. In this research, we considered three types of products to be compared against ground stations: satellite-based, reanalysis, and ground-based data.

Satellite-Based Product
Remote Sens. 2020, 12, x FOR PEER REVIEW 4 of 23

Tropical Rainfall Measuring Mission (TRMM 3B42)
The Tropical Rainfall Measuring Mission (TRMM) is a joint space mission between NASA and Japan's National Space Development Agency, designed to monitor and study tropical and subtropical precipitation and the associated release of energy [37]. Five instruments are used by the mission: TRMM Microwave Imager (TMI), Visible Infrared Scanner (VIRS), Precipitation Radar (PR), Lightning Imaging Sensor (LIS), and Clouds and Earth's Radiant Energy System (CERES) [38]. The PR and TMI are the instruments mainly used for precipitation. These instruments are used in an algorithm that composes the TRMM combined instrument (TCI) calibration dataset (TRMM 2B31) for the TRMM Multi-Satellite Precipitation Analysis (TMPA), whose TMPA 3B43 monthly rainfall rates and TMPA 3B42 daily and sub-daily (3 h) averages are very likely the most appropriate TRMM-related products for climate research. TRMM 3B42 and TRMM 3B43 are available at 0.25° spatial resolution and cover 50° N to 50° S for 1998-present [39]. The TRMM 3B43 dataset contains the output of the TRMM Algorithm 3B42, which is used to generate the Tropical Rainfall Measuring Mission (TRMM) estimates that comprise blended high quality (HQ)/infrared (IR) precipitation and root-mean-square (RMS) precipitation error estimates. The combined instrument rain calibration algorithm (3B-42) uses an optimal combination of 2B-31, 2A-12, SSMI, AMSR, and AMSU precipitation estimates (referred to as HQ) to adjust IR estimates from geostationary IR observations. Near-global estimates are made by calibrating the IR brightness temperatures to the HQ estimates. The estimates of TRMM 3B42 are scaled to resemble the rain gauge monthly analyses used in TRMM 3B43. The output is daily rainfall for 0.25 × 0.25 degree grid boxes [40]. TRMM precipitation data are widely used in studies covering tropical region countries.


Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR)
Estimates of precipitation are produced using the PERSIANN algorithm on GridSat B1 infrared satellite data, and the National Centers for Environmental Prediction (NCEP) stage IV hourly precipitation data are used for training the artificial neural network [19,41]. PERSIANN-CDR is adjusted using the Global Precipitation Climatology Project (GPCP) monthly product version 2.2 (GPCPv2.2). Therefore, the PERSIANN-CDR monthly means degraded to 2.5-degree resolution correspond to GPCPv2.2. A climate data record (CDR) is defined by the National Research Council (NRC) as a time series of measurements with sufficient duration, consistency, and continuity to determine climate variation and change. The key strengths of PERSIANN-CDR are its persistence, its long-term record of more than 30 years, its quarterly updates, and its use of many different data sources, which make it more reliable, together with its high-resolution (0.25°) monthly precipitation information that is consistent with GPCP monthly estimates. However, PERSIANN-CDR relies heavily on infrared data, which means that conversion from IR to precipitation rate needs a complex algorithm. Also, this product has daily temporal resolution, does not resolve the diurnal cycle, and may not record some short-lived, intense events [42].

Reanalysis Data

National Centers for Environmental Prediction (NCEP) Climate Forecast System Reanalysis (CFSR)
The NCEP Climate Forecast System Reanalysis (CFSR) is a complete dataset for the 31-year period from 1979 to 2009. The CFSR was built as a global, high-resolution, coupled atmosphere-ocean-land-surface-sea-ice system to provide the best estimate of the state of these coupled domains over this 31-year period [43,44]. The CFSR has also been extended as an operational, real-time product into the future. The new features of the CFSR include: (1) coupling of atmosphere and ocean during production of the 6-h guess field, (2) sea ice as an interactive model, and (3) satellite radiance assimilation by the Grid-Point Statistical Interpolation (GSI) scheme over the entire period [45]. The resolution of the CFSR global atmosphere is approximately 38 km (T382) with 64 levels extending from the surface to 0.26 hPa. The latitude spacing of the global ocean is 0.25 deg at the Equator, extending to a global 0.5 deg beyond the tropics, with 40 levels to a depth of 4737 m. The CFSR atmospheric model includes observed changes in carbon dioxide (CO2) over the 1979-2009 period, together with variations in aerosols and other trace gases and solar variations. Most available in situ and satellite observation data were included in the CFSR. Biases of satellite-based radiance observations in CFSR were corrected with spin-up runs at full resolution, considering variable CO2 concentrations [46]. The CFSR output products are available at daily and hourly time resolution on a 0.5 × 0.5 deg latitude-longitude grid.


ERA-5 Atmospheric Reanalysis
ERA-5 is the fifth-generation reanalysis from the European Centre for Medium-Range Weather Forecasts (ECMWF), and the latest climate reanalysis produced by ECMWF [47], providing hourly data on many atmospheric, land-surface, and sea-state parameters together with estimates of uncertainty [48,49]. It provides advancements compared to ERA-Interim, which was replaced by ERA-5 on 31 August 2019. The analysis by ERA-5 is produced at a one-hour time step using a significantly more advanced 4Dvar integration scheme. The horizontal resolution of ERA-5 is approximately 30 km, and it computes atmospheric variables at 139 pressure levels [50]. ERA-5 data are available in the Climate Data Store on regular latitude-longitude grids at 0.25° × 0.25° resolution, with atmospheric parameters on 37 pressure levels. For this work, we used ERA-5 hourly data on single levels from 1979 to present. Total precipitation data were downloaded at the hourly timescale and aggregated to the daily time step. Because total precipitation is provided in meters of water equivalent, it must be converted to mm before it can be compared with ground observations.
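The aggregation and unit conversion described above can be illustrated with a minimal Python sketch. It is not the processing chain used in the study: the function name `hourly_m_to_daily_mm`, the grouping by the calendar date of each timestamp, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

M_TO_MM = 1000.0  # 1 m of water depth equals 1000 mm


def hourly_m_to_daily_mm(hourly):
    """Sum hourly precipitation depths (m) into daily totals (mm).

    `hourly` is an iterable of (datetime, depth_in_m) pairs. Days are
    grouped by the calendar date of each timestamp, which is an
    assumption; the accumulation convention is not given in the text.
    """
    daily = defaultdict(float)
    for stamp, depth_m in hourly:
        daily[stamp.date()] += depth_m * M_TO_MM
    return dict(sorted(daily.items()))


# Hypothetical input: 24 hourly steps of 0.0005 m on a single day.
start = datetime(2000, 7, 1)
sample = [(start + timedelta(hours=h), 0.0005) for h in range(24)]
daily_totals = hourly_m_to_daily_mm(sample)
```

With real data, the hourly series for the grid cell nearest each station would take the place of `sample`.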

Gauge-Based Data
CPC Unified Gauge-Based Analysis of Global Daily Precipitation
A gauge-based analysis of daily precipitation has been built over the global land areas. Gauge reports from around 30,000 stations were collected from several sources including COOP, GTS, and other national and international agencies [51,52]. Quality control was performed through comparison with historical records and independent information from measurements at nearby stations, concurrent radar/satellite observations, and numerical model forecasts. Reports from quality-controlled stations are then interpolated to generate analyzed fields of daily precipitation with consideration of orographic effects [53]. The daily analysis is constructed on a 0.125 degree lat/long grid over the entire global land areas, and released on a 0.5 degree lat/long grid over the global domain for a period from 1979 to the present. This dataset has two elements: (a) the "retrospective version", which uses 30K stations and spans 1979-2005, and (b) the "real-time version", which uses 17K stations and spans 2006-present. The real-time data will be reprocessed in the future to be consistent with the retrospective analysis [54]. In this study, comparison between CPC precipitation estimates and observations is performed on a monthly basis because daily data from this dataset are unavailable for the study area.

Climate Hazards Center Infrared Precipitation with Station (CHIRPS) data
Climate Hazards Center Infrared Precipitation with Station (CHIRPS) data is a 30-year quasi-global rainfall dataset. Spanning 50° S-50° N (and all longitudes), from 1981 to near-present, CHIRPS combines 0.05° resolution satellite images with on-site station data to produce gridded rainfall time series for trend analysis and seasonal drought monitoring [55]. Station data are used to produce a preliminary 2-day rainfall product by blending data from sparsely located GTS gauges with rainfall estimates retrieved from the cold cloud duration (CCD) at every pentad [56]. The product provides improved daily, pentadal, dekadal, and monthly data at a 0.05° spatial resolution. Due to the high quality of the product, it has been used in several hydroclimate studies worldwide [56][57][58][59]. General properties of the employed and examined datasets are shown in Table 2.

Quality of Ground Stations
In Yemen, not only are the meteorological stations limited, but access to climate data is also very restricted: data are provided by only a few national agencies, which mostly ask for high fees to make them available for research and public use. In addition, the quality of climate data is not ensured, and there are often gaps in the data. For this study, it was only possible to get daily precipitation data with reasonable temporal and spatial coverage from stations belonging to the National Water Resources Authority (NWRA) of Yemen. More than 27 stations from different governorates in the highland region of Yemen, including Sana'a, Thamar, Amran, Hajjah, and Taiz governorates, were intensively checked, but most stations were excluded because of missing data; only 19 stations were found suitable for comparison with precipitation products. The quality check of ground observations involved methods such as the double mass curve [60] and checking the percentage of missing daily data. Generally, we used stations with complete daily records (no missing values) to compare the performance of the precipitation products during the period 1998-2007. Stations with at least five years of complete monthly records from Al Mahwit and nearby governorates were used to evaluate the performance of the most highly correlated product (CHIRPS) in some other areas within the highland region of Yemen.

Comparing Ground Observations with Precipitation Products
For the comparison between observations and the different precipitation products (satellite, reanalysis, and gauge-based data), we used two approaches. The point-to-nearest-pixel approach directly compares the data of a station with the nearest product estimate within the same grid cell area. The grid-cell-average-to-station-average approach compares the average precipitation estimate of the product over the study area with the overall average of the stations. When comparing daily rainfall estimates, however, it was challenging to find strong agreement between products and observations. Therefore, at the daily timescale it was more practical to evaluate the products on their ability to reproduce accurate wet durations and rainfall intensities during the period 1998–2007. At this timescale, summary statistics such as the standard deviation, mean, skewness, and daily maximum rainfall rate were calculated for each product. In addition, the Probability of Detection (POD) and False Alarm Ratio (FAR) were used to assess the ability of the products to detect daily rainfall events accurately. For the monthly and annual evaluation, the most commonly used statistical measures, namely the correlation coefficient (r), bias (BIAS), root-mean-square error (RMSE), deviation, and percentage deviation between products and observations, were applied to examine the agreement of each individual product (P) with station data (O).
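The point-to-nearest-pixel matching described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the function names `haversine_km` and `nearest_cell` are hypothetical, the coordinates are invented for the example, and the use of great-circle distance to pick the nearest grid-cell center is an assumption.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r_earth = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r_earth * math.asin(math.sqrt(a))

def nearest_cell(station, cell_centers):
    """Return the (lat, lon) grid-cell center closest to the station."""
    lat, lon = station
    return min(cell_centers, key=lambda c: haversine_km(lat, lon, c[0], c[1]))

# Hypothetical station and grid-cell centers, for illustration only.
station = (15.49, 43.91)
cell_centers = [(15.45, 43.90), (15.50, 43.90), (15.50, 43.95)]
print(nearest_cell(station, cell_centers))
```

For a coarse product, several stations may map to the same cell, in which case they are all compared against the same product series.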
The correlation coefficient, denoted by r and sometimes referred to as CC, measures the strength of the linear relationship between two variables [61]. Here, r (Equation (1)) is used to evaluate the agreement of individual products (P) with station data (O). A value of r close to 1 indicates a near-perfect linear fit between the products and station data.
Bias (Equations (2) and (3)) refers to the tendency of a measurement process to over- or underestimate the value of a population parameter [62]. Bias can be negative (underestimation) or positive (overestimation), depending on the accuracy of each product (P).
RMSE (Equation (4)) is a quadratic scoring rule that measures the average magnitude of the error. It is the square root of the average of the squared differences between prediction and actual observation [63].
Percent Error (Equation (5)) is applied to compare the experimental quantity, which is referred to as the precipitation product, P, with the theoretical quantity, the observation, O, which is considered the "correct" value. The percent error is the absolute value of the difference divided by the "correct" value, times 100.
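The measures described above can be sketched in a few lines. Because the equations themselves are not reproduced in this section, the forms below are the standard textbook definitions and should be read as assumptions: bias is taken as the mean of P − O, and the function names are hypothetical.

```python
import math

def pearson_r(p, o):
    """Pearson correlation between product (p) and observation (o) series."""
    n = len(o)
    mean_p, mean_o = sum(p) / n, sum(o) / n
    cov = sum((a - mean_p) * (b - mean_o) for a, b in zip(p, o))
    var_p = sum((a - mean_p) ** 2 for a in p)
    var_o = sum((b - mean_o) ** 2 for b in o)
    return cov / math.sqrt(var_p * var_o)

def mean_bias(p, o):
    """Mean of (P - O): positive means overestimation (assumed definition)."""
    return sum(a - b for a, b in zip(p, o)) / len(o)

def rmse(p, o):
    """Square root of the mean squared difference between P and O."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, o)) / len(o))

def percent_error(p, o):
    """Absolute difference relative to the observed ('correct') value, in %."""
    return abs(p - o) / o * 100

# Invented monthly totals in mm, for illustration only.
product = [10.0, 20.0, 30.0]
observed = [12.0, 18.0, 33.0]
print(pearson_r(product, observed), mean_bias(product, observed), rmse(product, observed))
```

Note that `percent_error` is undefined when the observed value is zero, so dry months would need separate handling.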
The POD (Equation (6)) ranges from 0 to 1, where 1 is the optimal score. FAR (Equation (7)) also lies between 0 and 1, where 0 is the desired result. These measures (Table 3) are widely used [64][65][66] and described in detail by Wilks [67]. A threshold of 0.5 mm was used to distinguish between rainfall and no-rainfall events. This choice was based on the rainfall pattern of the study area, where rain falls in short, intense bursts; any value below this threshold might be considered noise. In addition, Taylor diagrams (Taylor, 2001) were used to summarize the statistical relationship between ground observations and products. Taylor diagrams provide a graphical overview of how closely a pattern (or a set of patterns) agrees with observations. The agreement between two patterns is quantified in terms of their correlation, their centered root-mean-square difference, and the magnitude of their variations (represented by their standard deviations). The diagram is useful in evaluating multiple aspects of complex models or in gauging the relative skill of many different models [68].
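As a minimal sketch of the categorical scores, the snippet below counts hits, misses, and false alarms with the 0.5 mm threshold and derives POD and FAR from them. The standard contingency-table definitions are assumed (POD = hits / (hits + misses), FAR = false alarms / (hits + false alarms)), a day is treated as wet when rainfall is at or above the threshold, and the function names and sample values are hypothetical.

```python
def contingency(observed, product, threshold=0.5):
    """Count hits, misses, and false alarms for daily rain/no-rain events."""
    hits = misses = false_alarms = 0
    for o, p in zip(observed, product):
        obs_wet = o >= threshold
        prod_wet = p >= threshold
        if obs_wet and prod_wet:
            hits += 1
        elif obs_wet and not prod_wet:
            misses += 1
        elif prod_wet and not obs_wet:
            false_alarms += 1
    return hits, misses, false_alarms

def pod(hits, misses):
    """Probability of Detection: share of observed wet days that were detected."""
    return hits / (hits + misses) if (hits + misses) else float("nan")

def far(hits, false_alarms):
    """False Alarm Ratio: share of product wet days that were not observed."""
    return false_alarms / (hits + false_alarms) if (hits + false_alarms) else float("nan")

# Invented daily rainfall in mm, for illustration only.
observed = [0.0, 1.2, 0.3, 5.0, 0.0, 12.0]
product = [0.6, 0.8, 0.0, 0.0, 0.0, 9.0]
h, m, f = contingency(observed, product)
print(pod(h, m), far(h, f))
```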

Examination of Rainfall Daily Estimates
Several histograms are constructed to show the number of wet days and the intensity of rainfall for the different precipitation products over the period 1998–2007. Based on ground stations, the highest daily rainfall rate is 100 mm day⁻¹, and the lowest is 0.5 mm day⁻¹. Therefore, the histograms are built within these limits to compare gridded products and ground observations, with a threshold of 0.5 mm day⁻¹. Days with less than 0.5 mm were excluded from this comparison. Starting from the 0.5 mm threshold, the precipitation products are examined on an equal basis, and the discrepancy and variance between them are reduced, especially given the frequent occurrence of small rainfall events between 0 and 0.5 mm in all precipitation products. For the Kokaban station, which is located in the high-altitude area within Al Mahwit (2650 m above sea level), CFSR and PERSIANN-CDR produce a higher number of wet days than the observations. According to CFSR and PERSIANN-CDR, the total numbers of wet days are 1168 and 964, respectively, while observations showed only 471 days. Therefore, both products (PERSIANN-CDR and CFSR) overestimate wet events, particularly light rainfall (<5 mm), which occurred more frequently. In addition, overestimation of wet events is also found for the Al Rojom and Al Mahwit city stations, which are located in the high, complex mountainous terrain within Al Mahwit. In the low areas of Al Mahwit (Al Khamees and Al Khabt stations), CFSR reproduces wet days better than in the high, complex areas. The deviation of wet days between CFSR and ground stations in Al Khamees and Al Khabt is −18.4 and −4.7%, respectively. This may indicate that the CFSR reanalysis performs better in low areas than in high and complex mountainous areas.
On the other hand, the average deviation between TRMM 3B42 and observations is around 14% in the high areas (Kokaban, Al Mahwit city, Al Rojom) compared to 45% in the low areas (Al Khamees and Al Khabt). TRMM 3B42 may therefore estimate wet durations better in high, complex mountainous areas than in low areas. The ERA-5 reanalysis product shows a lower number of wet days at all stations, with an average deviation from observations of around −81.13%. CHIRPS reproduces the observed wet durations most closely. In the complex high areas (Kokaban, Al Rojom, Mahwit city), the average difference in wet days between CHIRPS and ground stations is around 6.7%, while in Al Khamees and Al Khabt the deviation is 9.2 and 11.5%, respectively. Figure 2 shows the occurrence of wet days for the different precipitation products.
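The wet-day comparison can be illustrated with a short sketch. The deviation formula is not stated in this section, so the signed relative form 100 × (product − observed) / observed is an assumption, chosen because it is consistent with the negative values reported for underestimating products. The function names are hypothetical; the Kokaban counts (1168, 964, and 471 days) are those quoted in the text.

```python
def count_wet_days(series, threshold=0.5):
    """Number of days with rainfall at or above the threshold (mm/day)."""
    return sum(1 for value in series if value >= threshold)

def percent_deviation(product_count, observed_count):
    """Signed deviation of product wet days from observed wet days, in %."""
    return 100.0 * (product_count - observed_count) / observed_count

# Invented daily series in mm, for illustration only.
print(count_wet_days([0.0, 0.4, 0.5, 3.2, 0.0, 18.0]))

# Kokaban wet-day counts as reported in the text.
observed_wet = 471
for name, wet in [("CFSR", 1168), ("PERSIANN-CDR", 964)]:
    print(name, round(percent_deviation(wet, observed_wet), 1))
```

Under this assumed definition, both products report roughly twice as many wet days as observed at Kokaban or more, which matches the overestimation described above.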
In addition to counting wet days, quantitative statistics such as skewness, standard deviation, mean, and variance are calculated for the precipitation products and ground observations. Higher mean and standard deviation are found in the ground observations at almost all stations, while the examined products are skewed towards low precipitation and show lower standard deviation (Table 4). CFSR, however, shows a higher standard deviation. The average maximum daily rainfall in the ground observations is around 61 mm day⁻¹. This rate is underestimated by the precipitation products, except for CFSR, which overestimates precipitation rates at all stations. Categorical statistics of the daily precipitation data are shown in Table 5. A Probability of Detection (POD) close to 1 indicates high accuracy, while False Alarm Ratio (FAR) values are better the closer they are to 0. As seen in Table 5, CHIRPS has the highest average POD, with a value of around 0.77, while its FAR is 0.36. The average POD and FAR of PERSIANN-CDR are 0.71 and 0.68. The average POD of TRMM 3B42 is 0.52, while its average FAR is 0.71. For CFSR, the average POD and FAR are 0.48 and 0.77; for ERA-5, they are 0.09 and 0.76. We found that the ability of the different products to detect rainfall events (POD) is not influenced by elevation. This result is consistent with findings for different topographic conditions in Mexico [69], but does not agree with the same study in terms of FAR, which shows no tendency to decrease in low-elevation areas.
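A minimal sketch of the daily summary statistics is given below. It is not the authors' code: restricting the statistics to wet days (at or above 0.5 mm) is an assumption, skewness is computed as the population moment ratio m3 / m2^1.5, and the function name and sample values are hypothetical.

```python
import math

def daily_summary(series, threshold=0.5):
    """Mean, standard deviation, skewness, and maximum over wet days only."""
    wet = [x for x in series if x >= threshold]
    if not wet:
        raise ValueError("no wet days at or above the threshold")
    n = len(wet)
    mean = sum(wet) / n
    m2 = sum((x - mean) ** 2 for x in wet) / n  # population variance
    m3 = sum((x - mean) ** 3 for x in wet) / n  # third central moment
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {"n_wet": n, "mean": mean, "std": math.sqrt(m2),
            "skewness": skewness, "max": max(wet)}

# Invented daily rainfall in mm: many light days and one heavy event.
print(daily_summary([0.0, 1.0, 1.0, 1.0, 10.0, 0.2]))
```

A strongly positive skewness reflects a series dominated by light events with a few heavy ones, the pattern the text reports for most products.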
Overall, none of the investigated products (satellite, reanalysis, gauge-based) fully reflects the variability of rainfall rates in the study area. Heavy rainfall events are underestimated or missed entirely, and rainfall events >25 mm day⁻¹ are rarely captured by the precipitation products. Furthermore, light rainfall events are reported too frequently by the products. Therefore, no attempt was made to investigate extreme events, especially given the short observation period, which limits the visibility of extremes in the study area.

Monthly Evaluation
The agreement between the different rainfall products and observations is examined using a set of commonly used statistical estimators, including correlation coefficient (r), bias (BIAS) and root-mean-square error (RMSE) (method details are discussed in Section 3.2). Monthly total rainfall by each product is compared to observations in each individual ground station over the period 1998-2007.
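Before the monthly statistics can be computed, daily values have to be aggregated into monthly totals. The sketch below shows one way to do this with the standard library; the function name `monthly_totals` and the sample records are hypothetical, and the series is assumed to be complete, as with the daily stations used here.

```python
from collections import defaultdict
from datetime import date

def monthly_totals(daily_records):
    """Sum daily rainfall (mm) into totals keyed by (year, month)."""
    totals = defaultdict(float)
    for day, rainfall_mm in daily_records:
        totals[(day.year, day.month)] += rainfall_mm
    return dict(sorted(totals.items()))

# Invented daily records, for illustration only.
records = [
    (date(1998, 1, 1), 2.0),
    (date(1998, 1, 15), 3.5),
    (date(1998, 2, 1), 1.0),
]
print(monthly_totals(records))
```

The resulting product and station totals can then be paired month by month to compute r, bias, and RMSE.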
The results show that CHIRPS is the most accurate rainfall product, with high correlation and low bias and error. TRMM 3B42 is the second-best product. CFSR, ERA-5, PERSIANN-CDR, and CPC show lower correlation with higher error and bias. The highest correlation of CHIRPS with station data is found in Al Khamees district (r = 0.98), followed by Rojom (r = 0.87). At the Kokaban, Mahwit city, and Khabt stations, CHIRPS shows correlations of 0.75, 0.74, and 0.65, respectively.
In the high-altitude areas, TRMM 3B42 shows an average correlation of around 0.58, while in the lowland areas (Al Khamees and Al Khabt), the average correlation between TRMM 3B42 and ground data is 0.54. In the high mountainous areas, CFSR, PERSIANN-CDR, ERA-5, and CPC show average correlations of 0.46, 0.55, 0.23, and 0.28, respectively, while in the low areas, the correlation of CFSR with observations increases to around 0.61. CPC and ERA-5 show a low correlation with observed data throughout.
Among the examined datasets, CFSR shows the highest root-mean-square error (RMSE), especially in Kokaban, with a value of around 70.0 mm, and in Mahwit city, at 73.0 mm. PERSIANN-CDR, ERA-5, and CPC show the next-highest RMSE, with values of around 44.0, 53.0, and 52.0 mm, respectively. In the mountainous areas, specifically in Kokaban, CFSR and PERSIANN-CDR tend to estimate more rain than was observed, while ERA-5 and CPC underestimate observed precipitation. The highest bias between precipitation products and observations is found with CFSR, followed by CPC and ERA-5.

Moreover, the monthly values are plotted in a Taylor diagram (Figure 4). In general, CHIRPS correlates well at all stations, with a lower standard deviation in the high mountainous areas and a slightly higher standard deviation in the low areas (Al Khamees and Al Khabt). For the Kokaban station, the CHIRPS correlation is around 0.75 with a standard deviation of about 0.25, which is lower than the normalized standard deviation. In Al Rojom and Mahwit city, CHIRPS shows high correlations of 0.87 and 0.72, respectively. The standard deviation of CHIRPS in Al Rojom and Al Mahwit city is lower than observed. This may be due to the misrepresentation of heavy rain events by CHIRPS, which affects the standard deviation. In the low areas (Al Khamees and Al Khabt stations), the standard deviation of CHIRPS is higher than the normalized standard deviation by around 0.15. Overall, the average correlation of CHIRPS over the study area is around 0.78.

TRMM 3B42 shows good correlation with observations and is the second-best product after CHIRPS. The correlation coefficient of TRMM 3B42 is around 0.57 in the highland areas and around 0.54 in the low areas, while its standard deviation tends to be lower in the high areas than in the low areas. Again, this is due to underestimation and to its limited capability to capture high rainfall events compared to ground observations.
PERSIANN-CDR shows rather good correlation with the ground data of the Al Rojom and Al Khamees stations, with values of 0.75 and 0.81, respectively. However, at the other three stations (Kokaban, Mahwit city, and Al Khabt), PERSIANN-CDR shows low correlation owing to its overestimation of total monthly precipitation in the study area. CFSR shows a high standard deviation and overestimates monthly rainfall rates at all stations, most strongly at the Kokaban station. Over the study area, the average monthly correlation of CFSR is about 0.51. CPC and ERA-5 show a weak correlation and very low standard deviation when compared to the observed data (Figure 4). This is due to the low precipitation estimates of both products in this area.
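The three quantities summarized by a Taylor diagram can be computed directly, as in the sketch below. The normalization by the observed standard deviation is an assumption about how the diagram was drawn (the text refers to a normalized standard deviation), and the function name and sample values are hypothetical.

```python
import math

def taylor_stats(product, observed):
    """Correlation, normalized std, and normalized centered RMS difference."""
    n = len(observed)
    mean_p, mean_o = sum(product) / n, sum(observed) / n
    dp = [x - mean_p for x in product]
    do = [y - mean_o for y in observed]
    std_p = math.sqrt(sum(d * d for d in dp) / n)
    std_o = math.sqrt(sum(d * d for d in do) / n)
    r = sum(a * b for a, b in zip(dp, do)) / (n * std_p * std_o)
    crmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(dp, do)) / n)
    return {"r": r, "norm_std": std_p / std_o, "norm_crmsd": crmsd / std_o}

# Invented monthly totals in mm, for illustration only.
observed = [5.0, 40.0, 80.0, 20.0, 0.0, 10.0]
product = [8.0, 30.0, 60.0, 25.0, 2.0, 12.0]
print(taylor_stats(product, observed))
```

A normalized standard deviation below 1 means the product varies less than the observations, which is how an underestimation of heavy events would appear on the diagram. The three values are linked by the law-of-cosines relation norm_crmsd² = norm_std² + 1 − 2 · norm_std · r, which is what allows them to share a single plot.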
The average correlation of TRMM 3B42 at the five stations is around 0.56. PERSIANN-CDR, CFSR, ERA-5, and CPC show average correlations of 0.53, 0.51, 0.20, and 0.15, respectively. The RMSE is highest for CFSR, especially at the Kokaban station, with a value of around 69.6 mm and a bias of 50.2 mm. In Kokaban, CFSR produces high rainfall estimates compared to the observations. In general, the average RMSE over all stations for CHIRPS, TRMM 3B42, PERSIANN-CDR, CFSR, ERA-5, and CPC is 25.5, 35.5, 34.2, 60.4, 45.7, and 45.5 mm, respectively (Table 6).

The average monthly total rainfall of each precipitation product over the study area is used to calculate the percentage difference between the product and the observations. The results show that the percentage differences of PERSIANN-CDR and CFSR from observations are 46.5 and 81.25%, respectively. Similarly, ERA-5 and CPC show differences of around −158.2 and −148.2% from observations, whereas the differences of CHIRPS and TRMM 3B42 from observed data are −24.07 and 32.32%, respectively (Figure 5).
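The formula behind these percentage differences is not given in this section. A difference relative to the observation, 100 × (P − O) / O, cannot fall below −100% when P is non-negative, so the values near −150% reported for ERA-5 and CPC suggest a different definition. One possible reading, offered purely as an assumption, is a symmetric difference relative to the mean of P and O, which is bounded by ±200%. The sketch below contrasts the two forms using invented numbers and hypothetical function names.

```python
def relative_difference(p, o):
    """Difference relative to the observation, in %; at least -100 for p >= 0."""
    return 100.0 * (p - o) / o

def symmetric_difference(p, o):
    """Difference relative to the mean of p and o, in %; lies within [-200, 200]."""
    return 100.0 * (p - o) / ((p + o) / 2.0)

# Invented monthly means in mm: a product that reports a tenth of the observed rain.
p, o = 10.0, 100.0
print(relative_difference(p, o), symmetric_difference(p, o))
```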
Further, to make the study as comprehensive as possible, additional monthly rainfall data from 14 stations around the study area are used. These additional data are used specifically to verify the performance of the highest correlated product (i.e., CHIRPS). As this additional dataset is temporally limited compared to the daily stations used in this study, the comparison was only done with CHIRPS. Similar to the comparison at the daily stations, the results show a high correlation and low bias for CHIRPS (Table 7) at all 14 stations. On average, the correlation coefficient (CC), bias, and RMSE are 0.88, 2.905 mm, and 40.1 mm, respectively. Overall, the findings are consistent with the evaluation at the daily stations, which also shows a high correlation for CHIRPS.

Annual Timescale Evaluation
Rainfall maps are constructed to display the variability of the annual total precipitation presented by the different products. The maps cover the entire study area and show the annual total rainfall during the period 1998–2007. Based on ground observations (Figure 6), the highest rainfall rates are observed at the Al Mahwit city and Al Rojom stations, with annual total precipitation of around 410 mm year⁻¹ at both stations. Northeast of Al Mahwit (Kokaban station), the annual rainfall is 346.4 mm year⁻¹. The Kokaban station represents the rainfall rates in the highest part of the region (altitude around 2600 m above sea level). The lowest rainfall rates are observed in the low areas within Al Mahwit, in the Al Khamees and Al Khabt districts, at an altitude of around 400 m above sea level. The average annual precipitation at the Al Khamees and Al Khabt stations is around 263.6 mm year⁻¹. Annual precipitation estimates for the study area differ from one product to another. However, the tested products (Figure 7) show a similar tendency to produce high rainfall estimates northeast of Al Mahwit (high areas) and lower rates in the west and southwest of the area (low areas). This rainfall pattern is apparently influenced by the topography of the region.
The total annual rainfall is captured well by CHIRPS, followed by TRMM 3B42. CHIRPS is the more accurate of the two and closely reproduces the observed rainfall pattern over the entire area.
In line with the observations (Figure 6), CHIRPS produces high precipitation rates in Al Mahwit city and Al Rojom, and provides the best estimate for the Kokaban station. TRMM 3B42 shows relatively good estimates for several stations (Al Rojom and Al Khamees) but fails to reproduce the rainfall patterns of the study area. The average annual rainfall estimate by CHIRPS in the high, complex mountainous terrain (Kokaban, Al Rojom, Al Mahwit city) is 351.7 mm year⁻¹. At the low-area stations (Al Khamees and Al Khabt), the CHIRPS rainfall estimate is around 207.1 mm year⁻¹. CHIRPS thus appears to slightly underestimate rainfall rates in the study area. Taking the average annual rainfall, the percent bias (Pbias) of CHIRPS over all stations is −8.68%. TRMM 3B42, on the other hand, overestimates the annual rainfall rates in Kokaban, Al Khabt, and Al Rojom, and underestimates them for Al Mahwit city and Al Khamees. CFSR and PERSIANN-CDR show the highest rainfall estimates for all stations. In Kokaban, CFSR shows the highest rainfall estimate, with an average of around 937.2 mm year⁻¹, while for the Al Rojom, Mahwit city, Al Khamees, and Al Khabt stations, the average rainfall estimate by CFSR is 431.2 mm year⁻¹. The Pbias between CFSR and observations is 66.21%. PERSIANN-CDR shows the second-highest rainfall estimates. In the high mountainous areas (Kokaban, Al Rojom, and Mahwit city), the average annual rainfall estimate by PERSIANN-CDR is around 481.5 mm year⁻¹, and in the low areas (Al Khamees and Al Khabt) it is around 275.0 mm year⁻¹; the Pbias is 22.9%.
CPC and ERA-5 underestimate the annual precipitation rates of the study area. The average annual rainfall estimates in the high areas are around 39 and 38.2 mm year⁻¹ for CPC and ERA-5, respectively, while in the lowland areas (Al Khamees and Al Khabt) these values drop to 31.8 and 27.9 mm year⁻¹. The Pbias of the average annual precipitation of CPC and ERA-5 relative to ground stations is −88.35 and −88.97%, respectively. Because of the large variation in rainfall estimates between the products, it was not practical to show the maps on the same rainfall scale, as some products would then appear in a single color class (Figure 7). The observed average annual rainfall from ground stations and the estimates of the precipitation products are shown in Table 8.
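The percent bias can be sketched as follows, assuming the common definition Pbias = 100 × Σ(P − O) / ΣO; the exact formula used in the study is not shown in this section. The inputs are the rounded group averages quoted in the text, used here as stand-ins for per-station values, which is a further assumption. Because of that rounding (for example, "around 410 mm year⁻¹"), the result is only expected to land near the reported −88.35% for CPC, not to reproduce it. The function name `pbias` is hypothetical.

```python
def pbias(product, observed):
    """Percent bias: 100 * sum(P - O) / sum(O); negative means underestimation."""
    return 100.0 * sum(p - o for p, o in zip(product, observed)) / sum(observed)

# Station order: Al Mahwit city, Al Rojom, Kokaban, Al Khamees, Al Khabt.
# Values in mm/year are rounded averages from the text, not the station data.
observed = [410.0, 410.0, 346.4, 263.6, 263.6]
cpc = [39.0, 39.0, 39.0, 31.8, 31.8]
print(round(pbias(cpc, observed), 2))
```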

Discussion
Finding a gridded precipitation product that generates the same rainfall estimates as ground observations, particularly at the daily timescale, is challenging in the highland region of Yemen. The reasons are the complex topography, the low number of ground stations, and a rainfall pattern typified by heavy, short rainfall events [30,70,71], which are often missed by gridded precipitation products such as remote sensing products [72]. Consequently, the agreement between the products and station data is rather weak at the daily timescale and improves when the data are aggregated to monthly and yearly time steps. This result agrees with other studies [73][74][75]. Moreover, several studies [51,76] recommend comparing precipitation products with ground records at a monthly or annual timescale, since field-based stations (as point measurements) cannot be considered reference data for the assessment of area-based rainfall estimates unless compared at a monthly or annual time step.
Based on our results, CHIRPS captures the daily wet durations well, with an average deviation of around 11.53% from ground observations; correlates well at the monthly timescale, with a value of 0.78; captures the annual rainfall rates, with a Pbias of around −8.68%; and reproduces the rainfall patterns of the study area. TRMM 3B42, as the second-best performing product, shows a deviation of around 43.7% for wet days and a relatively high monthly correlation of 0.56, with an annual Pbias of around 15.30%. Compared to CHIRPS, however, TRMM 3B42 shows lower skill in reproducing the rainfall patterns of the study area. Also, TRMM 3B42 tends to overestimate daily rainfall rates over the study area, whereas CHIRPS slightly underestimates them. Other products, such as CFSR and PERSIANN-CDR, overestimate daily precipitation rates at all stations, with higher biases and errors than CHIRPS and TRMM 3B42. The lowest precipitation estimates are produced by CPC and ERA-5 at all timescales and in all validation areas. Table 9 shows the results of all statistics applied to investigate the performance of the precipitation products at different timescales. Although investigating the specific reasons for the differences between rainfall products and observations is beyond the scope of this study, over- and underestimation of rainfall may relate to factors such as coarse spatial resolution, the low number of ground stations in the study area, errors related to the data source and to the merging and blending process, the model parameterization applied specifically for reanalysis products, and the satellite sensor used to differentiate between rain-bearing and rainless clouds [77,78].
For instance, the overestimation of rainfall rates by TRMM 3B42 is consistent with studies of neighboring regions such as Ethiopia [79], which indicated overestimation of rainfall rates over the Ethiopian highlands. The overestimation by PERSIANN-CDR, particularly of light rainfall events, is consistent with studies from the Iranian highlands and the high regions of China [41,80]. The large variability of the reanalysis datasets, mainly CFSR, agrees with the result presented by Qiaohong Sun and C. Maio 2017 [81], which covers around 30 global precipitation datasets and reports large differences in precipitation estimates, especially in complex mountain areas and high-latitude regions. Large variability has also been found in the complex mountainous terrain of the Ethiopian highlands [43] and the Ecuadorian Andes [82]. The CPC product, which underestimates rainfall rates in our study area, has been shown to underestimate monthly rainfall rates in other semiarid regions [83,84] and to miss local heavy precipitation events. In Yemen, the limited number of ground stations and the nature of the study area may affect the ability of products like CPC and ERA-5 to estimate rainfall rates correctly. In addition, reanalysis precipitation products are considered to have more uncertainty than the analyzed state fields [13,85,86].
In general, the high performance of CHIRPS in the highland region of Yemen (Al Mahwit) may be attributed to its high resolution (0.05°) and its low latency: station data are blended within two days, and a final product is produced with an average latency of about 3 weeks [55,85]. The high performance of CHIRPS has also been demonstrated in areas where station data are not included [13,86]. In addition, the product has been used in many evaluation studies [87,88] and recommended to support hydrological forecasts and trend analysis, for instance in an Ethiopian case study [89,90] and in other studies that use CHIRPS as input to hydroclimate modeling [58,90]. Table 10 shows the results of CHIRPS performance in several previous studies. The studies [13][14][15]90,91] were performed in semiarid regions that are generally similar in topography to Yemen's highland region and are influenced by nearly the same climate and rainfall patterns.

Summary and Conclusions
Ground observations are essential for studying the impact of climate change, as well as for hydrological studies at the regional and local level. In many countries, such as Yemen, observation records are very short, and in most cases there are large gaps in the ground data due to discontinuous data collection and poor documentation by local agencies. We also note some administrative factors affecting data collection in Yemen, such as the replacement of the authority's chairman and the low budget allocated for the maintenance and operation of meteorological stations. Furthermore, data collected in Yemen are not available for public use owing to data sharing policies and fees imposed by local agencies such as the meteorological service and the National Water Resources Authority (NWRA).
This study presents the first attempt to evaluate rainfall estimates from different precipitation products against the available station data in the highland region of Yemen. The comparison between the products and observations was performed at daily, monthly, and annual timescales using commonly applied statistical and categorical measures, such as the correlation coefficient, root-mean-square error, bias, deviation, probability of detection, false alarm ratio, and frequency of wet days. At the daily timescale, CHIRPS, followed by TRMM 3B42, agrees better with the observed occurrence of wet days than CFSR, PERSIANN-CDR, and ERA-5. CFSR and PERSIANN-CDR showed a larger overestimation of wet events, while ERA-5 underestimated daily rainfall events. Compared to CHIRPS and TRMM 3B42, the CFSR, PERSIANN-CDR, ERA-5, and CPC products showed lower correlation at the monthly scale, with higher bias and errors. In addition, maps of annual rainfall show the high performance of CHIRPS in reproducing rainfall patterns over the study area.
Overall, CHIRPS, with its high spatial resolution (0.05°), shows the best agreement with observations at all timescales and at all stations. Since CHIRPS is available at a daily timescale and for longer periods, and shows lower bias and error, this product can be used for climate studies such as regional downscaling and as input for hydrological models in regions with sparse ground stations and limited data, such as the highlands of Yemen.