Understanding the Present-Day Spatiotemporal Variability of Precipitable Water Vapor over Ethiopia: A Comparative Study between ERA5 and GPS

Abstract: Atmospheric water vapor plays a crucial role in atmospheric, climate change, meteorological, and hydrological processes. In a country like Ethiopia, with its complex topography and synoptic-scale spatiotemporal circulation patterns, the analysis of the spatiotemporal variability of precipitable water vapor (PWV) is very challenging, and is hampered by the lack of long observational datasets. In this study, we process the PWV over eight Ethiopian global positioning system (GPS) sites and one close to the Ethiopian eastern border, for the available common period 2013–2020, and compare with the PWV retrieved from the state-of-the-art ERA5 reanalysis. Both PWV datasets agree very well for our sample of sites, with correlation coefficients between 0.96 and 0.99 and an overall root mean square error of 3.4 mm; GPS-PWV shows a moderate wet bias compared to ERA5-PWV for the majority of the sites. Seasonal and diurnal cycles are also well captured by these datasets. The seasonal variations of PWV and precipitation at the sites agree very well. Maximum diurnal PWV amplitudes are observed for stations near water bodies or dense vegetation, such as Arbaminch (ARMI) and Bahir Dar (BDMT). At those stations, we investigated the PWV behavior during heavy rainfall events and observed, on average, a 25% increase in PWV over the 12 h before the start of the rainfall event, when the PWV peaks, and a 25% decrease over the 12 h after it.


Introduction
Water vapor is a very important constituent of the atmosphere, as it contributes strongly to the atmospheric energy budget by transporting moisture and energy as latent heat through the troposphere and lower stratosphere. Being the most important natural greenhouse gas, water vapor also affects the atmospheric radiation and circulation, and is therefore critical in the analysis and prediction of short and long-term changes in the global climate [1]. As a consequence, a good representation of water vapor concentrations in climate models is essential [2]. Water vapor is also the source of clouds and precipitation, and therefore of crucial importance in many meteorological and hydrological processes. In any column of air, the amount of water vapor, or precipitable water vapor (PWV), provides an upper limit to the potential precipitation which could fall from that column of air [1]. In particular, many studies have confirmed that PWV variations can effectively reveal the occurrences and the life cycles of precipitation events [3][4][5][6].
The amount of atmospheric water vapor is highly variable, both in space and time [7]. It can be measured by ground-based, in-situ, and space-based instruments. Ground-based global positioning system (GPS) receiver networks have proved to be reliable for estimating PWV, with root mean square error (RMSE) values of 1-3 mm [8,9].
Depending on their topography, Ethiopian regions have different climate vulnerability. The lowlands are vulnerable to increased temperatures and prolonged droughts that may affect livestock rearing. The highlands generally suffer from intense and irregular rainfall, leading to erosion, which together with higher temperatures may result in lower agricultural production. This, combined with an increasing population and conflicts, may lead to greater food insecurity in some areas [19]. Because of its complex orography, the prevalence of synoptic convective rainfall and synoptic-scale spatiotemporal circulation patterns, Ethiopia is a challenging region, and regional climate models (as opposed to global circulation models) are necessary in order to provide a reasonable representation of atmospheric water and rainfall for climate-change projections [20,22]. Prior to the climate-change impact assessment, however, the reliability of climate models is typically assessed by model validation [2]. For this validation, reanalysis data are mostly used as the reference. However, since reanalysis data have a strong underlying modeling component, the question can be posed to what extent they can be trusted in such a difficult region. Moreover, the accuracy of the data assimilation used in these reanalyses relies strongly on the availability and quality of the observational input [16].
Because of the relatively low cost and limited maintenance compared to other ground-based devices like radiosondes and microwave radiometers [1,9,14], measuring PWV with ground-based GPS stations is gaining importance in Ethiopia. In particular, the East African tropical region has recently benefited from the Africa-Array (AA), which has created a multidisciplinary research network for the broader Earth science community by installing continuous GPS instruments fitted with meteorological sensors [23]. However, the Ethiopian GPS network remains sparse, with incomplete spatial coverage and poor representativeness in mountainous areas, and it suffers from data discontinuities and many missing values. For instance, only two Ethiopian GPS stations (Addis-Ababa, ADIS, and Arbaminch town, ARMI) have a long time series of data, starting in 2007. Other Ethiopian stations whose PWV data have been described in earlier studies [24,25] were decommissioned in 2011 (with the data no longer available); they had been set up along the two opposite sides of the Ethiopian Great Rift Valley to study the geodynamical processes of this region, with the possibility to retrieve PWV from the data being only a by-product.
In this study, we used a network of GPS sites to process the PWV retrievals at eight Ethiopian GPS sites and one nearby site in Djibouti, close to the eastern border of Ethiopia, for their common observation period of 2013-2017/2020. Those sites are located in distinct geographical regions; see Figure 1a. We compare the PWV time series at those sites with the PWV output from the most recent European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis model, ERA5 [26], at the site locations. In particular, we compare the PWV seasonal and diurnal cycles between both datasets, but also between the different sites to assess their spatial distributions. Our study differs from previous GPS-based PWV comparisons over Ethiopia in the time period considered (2007-2011 in [24,25]), in the subset of Ethiopian GPS sites used [24,25,27], and in the reanalysis, since [24,25,27] relied on previous versions of the ECMWF reanalysis models. Those analyses concluded that the GPS PWV had a dry bias compared to ERA-Interim [28] over Ethiopian lowlands, and a wet bias over the highlands. Moreover, although the ERA5 temperature and precipitation fields have already been evaluated over Ethiopia [16,29], the precipitable water vapor, which is closely related to them, has not been evaluated so far.
Given its wide spatial and temporal availability, ERA5 is a very valuable dataset to study the spatiotemporal variability of PWV [30]. The reliability and accuracy of ERA5-PWV has already been comprehensively evaluated at a global [31] and European [30] scale. However, in regions with few observations or complex terrain such as Ethiopia, ERA5-PWV might be severely biased [16]. GPS-PWV, which is not assimilated in ERA5, has the advantage of high local accuracy, but is more limited in space and time compared to ERA5-PWV. Therefore, in this paper, we investigate the spatial and temporal PWV variability of ERA5 over Ethiopia, a country with a very complex orography and different spatiotemporal climate patterns. Specifically, we address the following questions: (1) Are there potential systematic PWV biases that depend on the orography? (2) Does the increased spatial and temporal resolution of ERA5 with respect to its predecessor ERA-Interim amount to a closer agreement with GPS in terms of the PWV seasonal and diurnal cycle? Finally, given the high temporal resolution of the GPS-PWV dataset, we explore a potential relationship between GPS-PWV and severe rainfall events at two Ethiopian sites in areas that are sensitive to flooding.
The paper is organized as follows. In Section 2, we describe the different datasets used and the processing that is needed to retrieve PWV. In Section 3, we validate our GPS processing by comparing with the processing carried out at the International GNSS Service (IGS), and with the radiosonde observations at Addis Ababa. Then, we compare the PWV at the GPS stations with ERA5-PWV in terms of biases, seasonal and diurnal cycle in Section 4. An application of the GPS-PWV time series in predicting severe rainfall events is illustrated in Section 5. Finally, conclusions are drawn in Section 6.

Datasets and Processing Strategies
The following datasets are used in this study: GPS observations, surface meteorological measurements, ERA5 model output [26], the International GNSS Service (IGS) troposphere product [32], and radiosonde data. They are described below.

GPS Observations and GAMIT-PWV
In this study, to obtain PWV at the eight Ethiopian dual-frequency geodetic GPS receivers over the period 2013-2017/2020 (Figure 1a), we used a network composed of these eight stations and of 26 surrounding IGS stations (Figure 1b). Unfortunately, four out of the eight Ethiopian GPS stations (ARMI, DEBK, BDMT, and SHIS) suffer from significant observation gaps (see Table 1), and no Ethiopian station was available during the study period in the eastern part of the country (only during 2007-2011, see [24]). To compensate for this, we used the neighboring IGS station Djibouti (DJIG) instead. As already mentioned in the introduction, more GPS stations were available in Ethiopia before 2011, and two of the GPS stations used here (ADIS and ARMI) have an extended time series since 2007 [24,25]. However, the common data period 2013-2017/2020 is used for the analysis in this paper. Although spatially limited, our GPS network covers different geographical areas with different climatological conditions, as also outlined in Table 1.
GPS data were processed using the GNSS at Massachusetts Institute of Technology (GAMIT) software v10.71 [33,34] in ionosphere-free double-difference mode to estimate the zenith total delays (ZTDs), from which the PWVs are derived. IGS stations are included in the data processing (1) to enhance the absolute tropospheric estimation by introducing long baselines between GPS stations [11], and (2) to assess the precision of our GAMIT-ZTDs. The main processing options are summarized in Table 2. The time resolution of the GAMIT-ZTDs is 2 h.
The troposphere causes an excess delay in the propagation of the GPS signals as well as a bending of their path in the atmosphere. The delay $\Delta L$ along the slant path $s$ between the GPS satellite and the ground receiving antenna can be expressed as:

$$\Delta L = 10^{-6} \int_{s} \left[ N_h(s) + N_w(s) \right] \, ds \tag{1}$$

where $N_h$ and $N_w$ represent the hydrostatic and wet refractivity, respectively.
Equation (1) can be decomposed into hydrostatic, wet, and horizontal gradient delay components [40], which depend on the elevation angle $\varepsilon$ and the azimuth $\alpha$. Each component of the total delay is usually considered as the delay in the zenith direction (the zenith hydrostatic delay (ZHD) and the zenith wet delay (ZWD)) and the first-order north-south and east-west gradient terms $G_N$ and $G_E$ [41][42][43], multiplied by the hydrostatic and wet mapping functions $m_h(\varepsilon)$ and $m_w(\varepsilon)$ (e.g., [44][45][46]). A mapping function represents the ratio of the excess path delay at the elevation angle $\varepsilon$ to the path delay in the zenith direction. The hydrostatic mapping function accounts for the bending effect. The sum of the ZHD and ZWD is called the zenith total delay (ZTD):

$$\mathrm{ZTD} = \mathrm{ZHD} + \mathrm{ZWD}$$

PWV Retrieval from GPS Observations
In GPS data processing, ZTD is estimated and ZHD is modelled in order to obtain the ZWD. ZHD can be modelled using e.g., the Saastamoinen [47] formula, which is a function of surface pressure $P_s$:

$$\mathrm{ZHD} = \frac{0.0022768 \, P_s}{f(\phi, H)}$$

where ZHD is in m, $P_s$ is in hPa, and $f(\phi, H)$ accounts for the variations in the gravitational acceleration at the station with latitude $\phi$ and height $H$ (in km) above a reference ellipsoid:

$$f(\phi, H) = 1 - 0.00266 \cos(2\phi) - 0.00028 \, H$$

The ZWD can then be converted to PWV by means of a conversion factor $Q$ [48]. Following [49], $Q$ is calculated from the weighted mean temperature $T_m$, the density of water $\rho_v$, and the specific gas constant of water vapor $R_v$ (461.5 J kg$^{-1}$ K$^{-1}$). In this study, we used $T_m$ and $P_s$ estimates at the site locations from the empirical global pressure and temperature model GPT2w [39], computed from ECMWF ERA-Interim temperature and humidity profiles.
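The retrieval chain described above (estimated ZTD, modelled ZHD, ZWD converted to PWV through a factor that depends on the weighted mean temperature) can be illustrated with a minimal Python sketch. It assumes the widely used form of the Saastamoinen model and a conversion factor written multiplicatively, so that PWV = Q × ZWD. The refractivity constants `K2_PRIME` and `K3` and the water density `RHO_W` are commonly quoted literature values and are not taken from this paper; the function names are hypothetical, and the exact form and constants used in the actual processing may differ.

```python
import math

# Assumed constants (commonly quoted literature values, not from this paper)
K2_PRIME = 22.1   # K/hPa, refractivity constant k2'
K3 = 3.739e5      # K^2/hPa, refractivity constant k3
RHO_W = 1000.0    # kg m^-3, density of liquid water
R_V = 461.5       # J kg^-1 K^-1, specific gas constant of water vapor (from the text)


def saastamoinen_zhd(p_s_hpa, lat_deg, h_km):
    """Zenith hydrostatic delay in metres from surface pressure in hPa."""
    f = 1.0 - 0.00266 * math.cos(2.0 * math.radians(lat_deg)) - 0.00028 * h_km
    return 0.0022768 * p_s_hpa / f


def conversion_factor(t_m_kelvin):
    """Dimensionless factor Q, assuming the convention PWV = Q * ZWD."""
    # Convert k2' and k3 from per-hPa to per-Pa so the units cancel with R_V.
    k2 = K2_PRIME / 100.0
    k3 = K3 / 100.0
    return 1.0e6 / (RHO_W * R_V * (k3 / t_m_kelvin + k2))


def pwv_from_ztd(ztd_m, p_s_hpa, lat_deg, h_km, t_m_kelvin):
    """PWV in mm from a ZTD in metres, surface pressure and mean temperature."""
    zwd_m = ztd_m - saastamoinen_zhd(p_s_hpa, lat_deg, h_km)
    return conversion_factor(t_m_kelvin) * zwd_m * 1000.0  # m -> mm
```

With these assumed constants, Q is close to 0.16 for typical values of the weighted mean temperature, so 1 cm of ZWD corresponds to roughly 1.6 mm of PWV.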

Radiosonde Data
Radiosonde data are often used to validate GPS-PWV, as they provide accurate vertical profile measurements of water vapor pressure (humidity) and temperature, from which PWV can be calculated. In this study, we used high-resolution sounding data at Addis Ababa station covering the period 2013-2020 to validate our GAMIT-PWV retrievals at this site. Addis Ababa radiosonde station, the only available radiosonde station in Ethiopia, is located at a horizontal separation less than 4 km and an elevation difference less than 80 m from the ADIS IGS station.
The radiosonde sounding data were collected from the Integrated Global Radiosonde Archive (IGRA) [50] and have a temporal resolution of 12 h. The data quality control strategy that we implemented is the same as in [51]. Radiosonde temperature and humidity profiles are required to reach at least 300 hPa for the top level and have data available at the surface and at least five (resp. four) standard pressure levels above the surface for stations below (resp. above) 1000 hPa. In addition, profiles with large gaps (i.e., larger than 200 hPa) in pressure between consecutive records of temperature or humidity are rejected.
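The screening criteria above can be sketched as a single acceptance test per sounding. The sketch below is illustrative only: the list of standard pressure levels, the 0.5 hPa tolerance for identifying the surface record, and the reading of "stations below (resp. above) 1000 hPa" as a surface pressure of at least (resp. less than) 1000 hPa are all assumptions, and `profile_passes_qc` is a hypothetical helper, not part of [51].

```python
# Assumed standard pressure levels in hPa; the text does not enumerate them.
STANDARD_LEVELS_HPA = (1000, 925, 850, 700, 500, 400, 300)


def profile_passes_qc(pressures_hpa, surface_pressure_hpa,
                      top_hpa=300.0, max_gap_hpa=200.0):
    """Return True if a sounding meets the screening criteria.

    pressures_hpa: pressures of the records that have valid temperature
    and humidity, including the surface record.
    """
    if not pressures_hpa:
        return False
    levels = sorted(pressures_hpa, reverse=True)  # surface first
    # Surface data must be available (the tolerance is an assumption).
    if abs(levels[0] - surface_pressure_hpa) > 0.5:
        return False
    # The profile must reach at least the 300 hPa level.
    if levels[-1] > top_hpa:
        return False
    # Count the standard levels above the surface that are present.
    n_std = sum(1 for p in STANDARD_LEVELS_HPA
                if p < surface_pressure_hpa and p in levels)
    required = 5 if surface_pressure_hpa >= 1000.0 else 4
    if n_std < required:
        return False
    # Reject profiles with gaps larger than 200 hPa between records.
    gaps = [lower - upper for lower, upper in zip(levels, levels[1:])]
    return all(gap <= max_gap_hpa for gap in gaps)
```

For a high-altitude station such as Addis Ababa, where the surface pressure is well below 1000 hPa, this logic requires four standard levels above the surface.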
The method described in [52] is used for the (small) altitude offset correction between the GPS and radiosonde stations. The hydrostatic and ideal gas equations are used to adjust the radiosonde pressure to the GPS station height as described in [53]. Finally, PWV from the GPS antenna level $P_0$ to the top of the radiosonde records (pressure $P_{\mathrm{top}}$) is calculated by integration over height as described in the following equation (PWV in units of kg/m$^2$ or mm):

$$\mathrm{PWV} = \frac{1}{g_0} \int_{P_{\mathrm{top}}}^{P_0} q(P) \, dP$$

where $q(P)$, the specific humidity as a function of atmospheric pressure $P$, is given in units of kg/kg and $g_0 = 9.80665$ m/s$^2$.
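As an illustration of the integration step, the following sketch computes PWV from a discrete profile of specific humidity by trapezoidal integration over pressure. The trapezoidal rule and the function name `pwv_from_profile` are assumptions made for this example; the text does not specify the numerical scheme that was used.

```python
G0 = 9.80665  # m s^-2, standard gravity (value given in the text)


def pwv_from_profile(pressure_hpa, specific_humidity):
    """PWV in kg m^-2 (numerically equal to mm of liquid water).

    pressure_hpa: pressures from the GPS antenna level up to the top of the
    profile, in decreasing order.
    specific_humidity: q in kg/kg at the same levels.
    """
    if len(pressure_hpa) != len(specific_humidity):
        raise ValueError("pressure and humidity must have the same length")
    total = 0.0
    for i in range(len(pressure_hpa) - 1):
        dp_pa = (pressure_hpa[i] - pressure_hpa[i + 1]) * 100.0  # hPa -> Pa
        q_mean = 0.5 * (specific_humidity[i] + specific_humidity[i + 1])
        total += q_mean * dp_pa
    return total / G0
```

A layer with a constant specific humidity of 0.01 kg/kg between 1000 hPa and 300 hPa gives about 71.4 kg/m² with this function.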

ERA5 Reanalysis Data
ERA5 is the fifth generation ECMWF atmospheric reanalysis of the global climate [26]. Compared with the previous reanalysis generation (ERA-Interim, [28]), ERA5 improves both the spatial resolution, from ~80 km to ~31 km (0.25° × 0.25°), and the temporal resolution, from 6 h to 1 h [54]. In this study, we extracted total column precipitable water vapor (TCWV) at 1 h temporal resolution from the ERA5 surface grid at the GPS station locations. Because GPS antenna heights and reanalysis model surface heights are not identical (see height differences $\Delta h$ in Table 3), the ERA5-PWV estimates were adjusted for the height difference $\Delta h$ based on an empirical formulation derived by [55]: $\Delta\mathrm{PWV} = -4 \times 10^{-4} \times \mathrm{PWV} \times \Delta h$, with $\Delta h$ in meters. The height differences between the GPS stations and the surface heights in the reanalysis range from −468.88 m to +1218 m (negative values indicate that the GPS station is below the ERA5 surface grid).

Table 3. GAMIT-PWV compared to ERA5-PWV (ERA5-PWV used as reference). The different columns denote the number of ERA5-PWV and GAMIT-PWV observation pairs, the mean differences between GAMIT-PWV and ERA5-PWV (GAMIT-PWV minus ERA5-PWV) in mm, the Root Mean Square Errors (RMSE) in mm, the correlation coefficient R, the altitude of the GPS station in m, and the height difference $\Delta h$ in m between GPS receiver antenna height and ERA5 model grid surface height. The comparisons between the two datasets are made at 2 h resolution over the period 2013-2020, when available.
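The height adjustment of ERA5-PWV described in this section can be sketched as follows. The sketch assumes that the correction term is added to the ERA5 value and that the height difference is the GPS antenna height minus the ERA5 surface height, which matches the sign convention stated in the text; the function name is hypothetical.

```python
def adjust_era5_pwv(pwv_mm, gps_height_m, era5_surface_height_m):
    """Adjust an ERA5 PWV value (mm) to the height of the GPS antenna.

    Uses delta_PWV = -4e-4 * PWV * dh, with dh in metres, where
    dh = GPS height - ERA5 surface height (negative if the GPS is lower).
    """
    dh = gps_height_m - era5_surface_height_m
    delta_pwv = -4.0e-4 * pwv_mm * dh
    return pwv_mm + delta_pwv
```

Under these assumptions the correction is a relative change of 0.04% per metre. For the largest height difference quoted above (+1218 m) it removes about 49% of the ERA5 value, so the linear formulation is being applied well beyond small height offsets at such sites.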

IGS Tropospheric Product
The IGS produces its troposphere product dataset [32], which includes ZTDs and horizontal gradients for all IGS stations. This product has a temporal resolution of 5 min and a nominal accuracy of ~4 mm [32]. In this study, we used this dataset, referred to hereafter as IGS-ZTD, to validate the tropospheric delay outputs from our own processing strategy (see Section 3.1) at the 26 IGS sites surrounding Ethiopia (including DJIG) over the period 2013-2020 (see Table 4). Additionally, this product is also available for the Ethiopian station ADIS (Addis-Ababa), as it is also an IGS station.

Surface Meteorological Measurements
Rainfall rate and surface temperature data for six Ethiopian stations (ADIS, ABOO, NEGE, ASOS, BDMT, and ARMI) were collected for the period 2013-2019 (see Table 4).
The surface temperature and rainfall records are used in this study to understand the spatiotemporal variations of the PWV in Ethiopia. The rainfall data will also enable us to investigate a possible link between GAMIT-PWV and heavy rain (see Section 5).

Validation of GPS Processing
In this section, we briefly validate our GPS processing strategy by comparing the calculated ZTDs of our network with the ZTDs processed by the IGS at common GPS sites (Section 3.1) and by studying the correlation of our GAMIT-PWV with radiosonde PWVs at Addis-Ababa (Section 3.2).

Validation of the GAMIT-ZTD Based on the IGS-ZTD
We first validated our GPS data processing by comparing the GAMIT-ZTD and the IGS-ZTD at the 27 IGS stations used in our network for the time span 2013-2020. In this comparison, we only retain the 18 IGS stations that have observation data for more than 50% of the processing time span. IGS-ZTD products at 5-min sampling rate are averaged to 2 h intervals to be directly comparable with the GAMIT-ZTD estimates in this study. The results show a good agreement, with biases between GAMIT-ZTD and IGS-ZTD ranging from −2.76 mm to 1.83 mm (IGS-ZTD taken as reference), RMSE ranging from 2.46 mm to 4.79 mm, and correlation coefficients between 0.93 and 1.0. These values are in agreement with the nominal precision of the IGS-ZTD announced by the IGS Central Bureau (~4 mm).
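The comparison procedure used here, and again for ERA5-PWV in Section 4, amounts to averaging the higher-rate series to 2 h intervals and computing the bias, RMSE, and correlation of the paired values. A minimal sketch is given below. It assumes gap-free, already time-aligned series and an RMSE that includes the bias; the handling of missing epochs in the actual processing is not described in the text, and both function names are hypothetical.

```python
import math


def block_average(values, block_size):
    """Average consecutive, non-overlapping blocks of samples.

    For example, block_size=24 turns 5-min samples into 2 h means and
    block_size=2 turns 1 h samples into 2 h means.
    """
    n_blocks = len(values) // block_size
    return [sum(values[i * block_size:(i + 1) * block_size]) / block_size
            for i in range(n_blocks)]


def comparison_stats(test, reference):
    """Return (bias, rmse, r) for paired samples, computed as test minus reference."""
    n = len(test)
    diffs = [t - r for t, r in zip(test, reference)]
    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    mean_t = sum(test) / n
    mean_r = sum(reference) / n
    cov = sum((t - mean_t) * (r - mean_r) for t, r in zip(test, reference))
    var_t = sum((t - mean_t) ** 2 for t in test)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return bias, rmse, cov / math.sqrt(var_t * var_r)
```

A positive bias from `comparison_stats(gamit, igs)` means that the GAMIT values are larger than the reference, in line with the sign convention used in the text.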

Validation of GAMIT-PWV with Radiosonde Observations
We compared GAMIT-PWV to radiosonde-PWV at Addis Ababa for the period 2013-2020. We obtained a high correlation (0.98), a small wet GAMIT-PWV bias of 0.51 mm w.r.t. radiosonde PWV, and an RMSE of 2.71 mm (Figure 2). Analyzing the GPS and radiosonde-PWV observations at the same station, but for a different time period (2007-2011), Mengistu et al. [25] found a wet bias of 3.3 mm (RMSE 4.3 mm) between the GPS-PWV and the 12:00 UTC radiosonde observations when the GPT surface pressure was used for the ZTD to PWV conversion, but only a bias of 0.1 mm (RMSE 2.5 mm) when making use of the observed surface pressure at the site. The better agreement between both datasets in our study might be explained by the different GPS processing, the use of the GPT2w surface pressure for the ZTD to PWV conversion, and the inclusion of radiosonde observations at 00:00 UTC, which might impact the comparison with GPS-PWV retrievals as well, due to the known radiation dry bias in radiosonde daytime humidity observations (e.g., [56]).

Spatiotemporal PWV Variability in Ethiopia from GPS and ERA5 PWV
In this section, we assess to which extent both GAMIT-PWV and ERA5-PWV represent the spatiotemporal variability of PWV in a country with a very complex orography such as Ethiopia. Table 3 shows the mean differences, RMSE, and correlation coefficients between the GAMIT-PWV and ERA5-PWV datasets at the eight Ethiopian GPS stations, plus DJIG (Djibouti), during the period 2013-2020. To calculate these statistical parameters, we time matched the ERA5-PWVs to the 2-h time resolution of GAMIT-PWV by averaging the 1-h sampled ERA5-PWVs to 2 h intervals. A first thing to note is that the two datasets are highly correlated, with correlation coefficients between 0.96 and 0.99. Mean PWV differences between both datasets are in the range of −0.  Table 3). The correction method that we applied to account for these height differences therefore seems not very successful for these sites.

Comparison of GAMIT-PWV and ERA5-PWV over Ethiopia
The mean differences and RMSE found here between GAMIT-PWV and ERA5-PWV are lower than those reported in [25] for the period 2007-2011 between GAMIT-PWV and ERA-Interim-PWV for the three stations common to both studies. In particular, the biases (resp. RMSE) of GAMIT-PWV w.r.t. ERA5-PWV for the ADIS, ARMI and BDMT stations are respectively −0.04 mm, 0.08 mm and 1.25 mm (0.62 mm, 0.57 mm and 2.05 mm), whereas the biases (resp. RMSE) of GAMIT-PWV w.r.t. ERA-Interim for these stations were −4.55, 3.78 and 4.47 mm (5.7, 5.6 and 6.3 mm) in [25]. The better agreement in our study might be ascribed to (1) the higher spatiotemporal resolution of ERA5 compared to ERA-Interim, (2) improved data assimilation and physical representation of processes in ERA5, and (3) the use of GPT2w surface pressure and weighted mean temperature to convert ZTD to PWV, as [25] used the older GPT model [57] based on the ERA40 reanalysis.
The earlier analyses [24,25] concluded that the GAMIT-PWV had a dry bias compared to ERA-Interim over Ethiopian lowlands, and a wet bias over the highlands. With respect to ERA5, we do not observe such a clear relation between the bias and the station's altitude. However, other factors, such as the location of GPS stations near water bodies and the related representativeness differences with gridded datasets like reanalyses, may contribute to discrepancies [24]. For instance, Bahir Dar (BDMT), a site in the Ethiopian northern highlands, is located on the southern side of Lake Tana, the largest highland lake in the region. The high evaporation rate from the lake may be responsible for the relatively high values of the GAMIT-PWV record with respect to ERA-Interim [25] and ERA5 (our study). Gridded reanalysis datasets, with their coarser horizontal resolution, might not satisfactorily capture the higher evaporation rates due to the presence of a lake and the related impact on the PWV field at the GPS site. The same explanation might apply to Djibouti (DJIG), located near the Red Sea and the Gulf of Aden.

Seasonal Cycle
We studied the seasonal cycle of the atmospheric water vapor over Ethiopia based on the GAMIT-PWV and ERA5-PWV datasets, with the precipitation and surface temperature time series as auxiliary tools. As the distribution of rainfall in the country depends both on the regional topography and on the seasonal variation of the atmospheric circulation [58], Ethiopia can be divided into three main regions based on the seasonal rainfall cycle: (i) northern and central-western areas with one rainy season that peaks in July/August; (ii) the southern part with two seasons of short rainfall (September-November) and long rainfall (March-May) respectively; and (iii) the eastern and central parts with two rainy periods called spring (February-May) and summer (June-September) rainy seasons. We will use this classification in discussing the seasonal PWV cycle as well.
For stations in the northern part of the country (BDMT, DEBK and SHIS), and for the station ASOS in the western part of the country, PWV peaks during the months May/June-August, which corresponds to the summer season. An example is given in Figure 3a with station BDMT. During this season, the air flow is dominated by a zone of convergence in low-pressure systems accompanied by the oscillatory Inter-Tropical Convergence Zone (ITCZ) extending from West Africa towards India through Ethiopia or north of it [59], bringing a sufficient amount of PWV and rainfall to the northern and western parts of Ethiopia. On the other hand, the lowest values of PWV in these parts of the country are observed during the months December-February, corresponding to the driest season for these areas, with little or no precipitation. During this season, Ethiopia is predominantly influenced by dry air masses originating from the Saharan Anticyclone as well as cool and dry air masses originating from the Siberian and Arabian Anticyclones [59].
In the central part of the country (stations ADIS and ABOO, see Figure 3b for ADIS), the lowest PWVs are recorded during the months February-May and the highest values during the months June-July. In this region of the country, a regime of a mono-modal summer rainfall cycle (June-September) arises, with peak rainfall especially in July/August, which is also reflected in the PWV seasonal variation of those two sites.
In the southern part of the country (stations NEGE and ARMI, see Figure 3c for ARMI), the highest PWV values are recorded once during the months September-November and once during the period March-May. The south of Ethiopia undergoes short rainfalls in September-November and long rainfalls in March-May. The (double) PWV peaks have a clear link with these two precipitation periods. The seasonal cycle of both rainfall and PWV is largely driven by north-south movements of the ITCZ across the region [23].
The seasonal cycle is also captured at the station DJIG, see Figure 3d, which is located to the east of Ethiopia. This station exhibits a peak of PWV around August and the lowest PWV values in December. In terms of rainfall, August is known to be the wet season (summer season) and December is the driest month in this region [60]. Unfortunately, we lack rainfall data for this site during the time period of our study.
In general, Figure 3 shows that both the ERA5-PWV and GAMIT-PWV seasonal variations have nearly identical patterns, in particular for the ARMI and ADIS stations. The seasonal variations of both datasets for the stations BDMT and DJIG are also very similar, but the PWV monthly mean values are biased between both datasets, with this bias being non-constant over the months. As shown in Table 5, both datasets reveal very similar amplitudes of the seasonal PWV variation at the sites. This is consistent with the finding that ERA5 is able to capture the seasonal cycles of temperature (Figure 4 and [16]) and rainfall in Ethiopia [16,29] reasonably well, even if the peak in temperature is shifted by a month (Figure 4), and despite the difference in rainfall amounts (wet bias of ERA5 compared to observations). From Figure 4, it can be noted that the temperature variation within a year is rather low for the Ethiopian sites, with a maximum difference between the monthly mean temperatures of around 5 °C. Overall, the summer (monsoon) seasons are wet and quite cloudy, with frequent afternoon thunderstorms, and are therefore characterized by relatively low temperatures compared to the other seasons. The winter and spring seasons are dry and warm, with the month of May often being the hottest month of the year. Fall has moderate temperatures.
We analyzed the year-to-year variability of the seasonal cycle of PWV using the standard deviation (STDV) of the monthly anomalies. The highest PWV inter-annual variability in terms of STDV is observed for the DJIG station (with a peak of 6.24 mm in April). Given its particular location near the Red Sea and the Gulf of Aden, the transport of moisture to this region is subject to a large inter-annual variability, which is also illustrated by its irregular rain pattern [60].
Larger standard deviations are also obtained for the stations located in the southern part of Ethiopia (ARMI and NEGE), ranging from 2.80 mm to 4.9 mm, during the driest winter and early spring season (November to April). This large year-to-year PWV variability might be associated with the large year-to-year rainfall variability of winter rains over south-eastern Ethiopia, dominated by large-scale changes in the Indian Ocean and its coupled atmosphere, with a clear link to the Indian Ocean Dipole (IOD). Winter rainfall over south-eastern parts of Ethiopia is increased during positive IOD events [15,59]. On the other hand, stations located in the central and southern parts of the country (ADIS, ABOO, NEGE, and ARMI) show a low STDV of the PWV monthly anomalies during the summer season, ranging from 0.4 mm to 1.18 mm. The small year-to-year variability during the summer season for this region arises because the rain pattern during the wet season is nearly the same from year to year and from station to station (see the small STDV of the rainfall amounts in Figure 3). From Figure 4, we can see that the inter-annual variability of temperature is low for most stations, especially during the summer season.
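The inter-annual variability measure used in this section, the standard deviation of the monthly anomalies, can be sketched as follows. The sketch assumes that anomalies are taken with respect to the multi-year mean of each calendar month and that the sample standard deviation (n − 1 in the denominator) is used; the text does not state either choice explicitly, and the function name is hypothetical.

```python
import math
from collections import defaultdict


def monthly_anomaly_std(monthly_means):
    """Standard deviation of monthly PWV anomalies per calendar month.

    monthly_means maps (year, month) -> monthly mean PWV in mm.
    Returns {month: standard deviation in mm over the available years}.
    """
    by_month = defaultdict(list)
    for (_year, month), value in monthly_means.items():
        by_month[month].append(value)
    result = {}
    for month, values in by_month.items():
        if len(values) < 2:
            continue  # a single year gives no estimate of the spread
        climatology = sum(values) / len(values)
        anomalies = [v - climatology for v in values]
        # Sample standard deviation (n - 1); an assumption of this sketch.
        variance = sum(a * a for a in anomalies) / (len(values) - 1)
        result[month] = math.sqrt(variance)
    return result
```

With only a few years of data per station, as is the case for several sites here, such estimates are based on very small samples and should be read with that in mind.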

Diurnal Cycle
Many processes can induce diurnal variations in atmospheric water vapor. These include e.g., (1) surface evapotranspiration, which peaks around noon [61], (2) atmospheric large-scale vertical motion, which tends to be downward from late morning to afternoon and upward from midnight to early morning [62], (3) atmospheric low-level moisture convergence and precipitation, which occurs more frequently around midnight [62].
As we have the GAMIT-PWV datasets available at 2-h temporal resolution and the ERA5-PWV at 1 h resolution, we can investigate the presence of a diurnal PWV cycle at our Ethiopian sites. Mengistu et al. [25] and Yehun et al. [27] already pointed out the presence of a diurnal PWV cycle at some Ethiopian sites (including ADIS, ARMI, DEBK, and BDMT). We found that the amplitude of the diurnal PWV cycle varies from 0.99 mm (ADIS) to 4.74 mm (ARMI), with the ARMI, BDMT and DJIG stations having the strongest diurnal cycle, with a mean amplitude between 3 and 5 mm (Figure 5 and Table 5), which corresponds to ~15% in terms of relative amplitude (Table 5). These three sites are located near water bodies and dense vegetation. BDMT is located near Lake Tana; ARMI is surrounded by many water bodies (e.g., Lake Chamo and Lake Abiyata) and is near dense vegetation, whereas DJIG is located near the Red Sea. For these three stations, surface evapotranspiration is the main factor driving the diurnal variations. The stations ADIS, ABOO, ASOS, and NEGE have much weaker diurnal cycles, with amplitudes of less than about 1.5 mm (less than 7.5% relative amplitude, Table 5). Although the ERA5-PWV dataset has a higher temporal resolution, the amplitude of the diurnal cycle is, for most of the sites, smaller for ERA5-PWV than for GAMIT-PWV, with the only clear exception being the SHIS site (with a diurnal amplitude 0.4 mm or 2% higher for ERA5 than for GAMIT-PWV). Apparently, the gridded ERA5-PWV dataset smooths out the local diurnal variations at the GPS site locations. Because of the stronger diurnal cycle in the GPS-PWV dataset, we have also investigated the GPS-PWV diurnal cycle over Ethiopia for all seasons separately. We concentrate here only on the sites with a clear, significant diurnal variation (i.e., BDMT, ARMI, DJIG, and DEBK). We found that the amplitude of the diurnal cycle depends on the season. Maximum amplitudes between 5 and 6 mm are observed during the fall and spring seasons.
As rainfall events occur in these seasons, the sites are then characterized by a very green environment, more water on the surface (more water bodies), and plenty of sunlight. The evapotranspiration from the green plants and water bodies results in the largest observed diurnal cycles in these seasons. On the other hand, minimum amplitudes of the diurnal cycle (only a few mm) at those sites arise during the winter and summer seasons. The winter season is characterized by little surface moisture and hence low evapotranspiration, whereas the summer season is characterized by relatively low temperatures (Figure 4) and little sunlight, which results in less evapotranspiration from the surface. In this analysis, we do not find that sites with similar seasonal cycles (see previous Section 4.2) also have similar diurnal cycles. Clearly, the diurnal variation of the PWV is governed more by local processes and vegetation.
The PWV diurnal cycle peaks in the late evening (18:00 or 20:00 GMT, 21:00-23:00 LT) for the stations ARMI and BDMT, with the peak at 18:00 GMT occurring only at BDMT for June, July and August (JJA) and September, October and November (SON). The minimum is reached around 10:00 GMT at BDMT (12:00 GMT for SON), and around 12:00 GMT at ARMI (14:00 GMT for JJA). Hence, these stations only show a small dependence of the phase of the diurnal cycle on the season. The phase of the maximum of the diurnal cycle at DEBK shifts by 2 h every season, from 14:00 GMT (SON) to 20:00 GMT during March, April and May (MAM), and back, while the minimum values are obtained during the morning (04:00 GMT or 06:00 GMT, so 07:00 or 09:00 LT). At DJIG, it is the phase of the minimum of the diurnal cycle that shifts by 2 h every season, from 14:00 GMT in June, July and August (JJA) to 18:00 GMT in December, January and February (DJF), and back. The phase of the maximum falls in the late morning (08:00 GMT), except during the summer (04:00 GMT, early morning: 07:00 LT). The diurnal cycle at DJIG is hence more or less reversed as compared to the other sites.
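The diurnal-cycle quantities discussed in this section (composite cycle, amplitude, relative amplitude, and phase of the maximum) can be derived from a time series as in the sketch below. It assumes that the amplitude is the difference between the maximum and minimum of the mean diurnal composite and that the relative amplitude is expressed with respect to the mean of that composite; the text does not define these quantities explicitly, so both are assumptions. The default offset of 3 h between GMT and local time follows the conversions quoted in the text, and the function name is hypothetical.

```python
from collections import defaultdict


def diurnal_cycle(samples, utc_offset_h=3):
    """Summarize the mean diurnal cycle of PWV.

    samples: iterable of (hour_utc, pwv_mm) pairs, e.g. at 2 h resolution.
    """
    by_hour = defaultdict(list)
    for hour_utc, pwv_mm in samples:
        by_hour[hour_utc].append(pwv_mm)
    # Mean PWV for each hour of the day (the diurnal composite).
    composite = {h: sum(v) / len(v) for h, v in sorted(by_hour.items())}
    mean_pwv = sum(composite.values()) / len(composite)
    peak_hour = max(composite, key=composite.get)
    # Amplitude taken as max minus min of the composite (assumption).
    amplitude = max(composite.values()) - min(composite.values())
    return {
        "composite": composite,
        "amplitude_mm": amplitude,
        "relative_amplitude_percent": 100.0 * amplitude / mean_pwv,
        "peak_hour_utc": peak_hour,
        "peak_hour_local": (peak_hour + utc_offset_h) % 24,
    }
```

Restricting `samples` to the months of one season before calling the function gives the seasonal diurnal cycles discussed above.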

Analysis of the Relationship between GPS-PWV and Heavy Rainfall
Ethiopia is highly exposed to natural disasters such as flooding and drought. For instance, in the time period considered here, 2013, 2016, 2019, and 2020 were years marked by major flooding events in the country, while 2015 was a drought year [63,64]. In this section, we elaborate on the potential of a GPS network in Ethiopia for nowcasting applications to predict heavy rainfall events. Zhang et al. [65] stated, generally, that PWV could be used to complement conventional meteorological observations for the monitoring and prediction of severe weather events. Indeed, many studies have confirmed that PWV variations can effectively reveal the occurrences and the life cycles of precipitation events [3][4][5]. For example, Priego et al. [4] found that three GPS stations on the Spanish Mediterranean coast show a quick and clear increase in PWV (around 30 kg/m²) a few hours before the onset of heavy precipitation. In addition, the maximum value of PWV occurs almost simultaneously with the peak intensity of rain. Based on the analysis of a number of case studies of intense precipitation in the Lisbon area, Benevides et al. [66] found that most intense rainfall events occur after steep ascents in PWV and developed a simple algorithm that forecasts rain in the 6 h after a steep ascent of the GPS-PWV at a single station. Those sharp increases in the GPS-PWV before very intense rainfall events were termed GPS-PWV "jumps" in [5], and are probably associated with water vapor convergence and the continued formation of cloud condensate and precipitation particles. The authors of [5] explored the pattern in the GPS-PWV time derivative before a heavy rainfall event for use in nowcasting applications.
In this study, we investigate heavy rainfall events over the entire periods of available data for the BDMT (2013-2016) and ARMI (2013-2015) sites. These sites lie in regions that are sensitive to flooding events, and they have the largest number of heavy rainfall events of all the sites in our sample. Figure 6 therefore presents the time series of GAMIT-PWV, ERA5-PWV and precipitation for the two sites. This figure shows that periods of intense rainfall (indicated by a horizontal red line and an orange rectangle) are typically associated with a PWV peak, while a PWV peak does not necessarily mean that there is or will be precipitation. We therefore analyzed the time-varying characteristics of PWV from about 12 h before, during, and until 12 h after heavy rainfall events. Based on the standard deviation of the rainfall records at these stations, we set a threshold value of 5 times the standard deviation to characterize heavy rainfall periods. This yields threshold values of 46.34 and 38.38 mm for BDMT and ARMI, respectively. Based on this criterion, we identified nine periods of heavy rainfall for BDMT and twelve periods for ARMI. We first discuss two specific cases. The maximum amount of rainfall recorded at ARMI was 86.2 mm on 15 October 2013. In that case, the GAMIT-PWV rose from 31.34 to 37.17 mm in the 12 h before the occurrence of the heavy rain (i.e., a relative change of 16%) and then dropped to 30.23 mm (i.e., −18%) 12 h after the heavy rainfall event.
As another example, the maximum amount of rainfall recorded at BDMT (90.8 mm) occurred on 2 September 2016. In this case too, we observed a rise in PWV during the 12 h before the rainfall (from 29.68 mm to a peak of 36.36 mm, i.e., an increase of 18%), followed by a decrease to 22.59 mm (i.e., −38%) 12 h after the start of the rainfall event. This analysis was carried out for all 21 heavy rainfall cases identified. In these cases, we observed a steady rise in PWV (by around 25% on average) from 12 h before the start of the rainfall, when the PWV reaches its peak value, followed by a steady PWV decrease that brings the PWV back to its initial value after about 12 h (a relative decrease of again almost 25%). Hence, for these extreme rainfall cases, there is a steady build-up of water vapor in the atmosphere before the heavy rainfall starts at the peak value of PWV. Afterwards, the pouring rain dries out the atmosphere again. As such, monitoring the PWV could serve as an indicator in a nowcasting tool for heavy precipitation at the two Ethiopian sites considered here.
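The event selection and the relative PWV changes quoted above can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing actually used in this study: the function names and the synthetic rainfall series are hypothetical, the use of the sample standard deviation is an assumption (the text does not say which estimator was used), and normalizing the changes by the peak PWV value is inferred from the quoted numbers, which it reproduces to within about one percentage point.

```python
from statistics import stdev


def heavy_rain_threshold(daily_rain_mm, k=5.0):
    """Threshold set to k times the standard deviation of the rainfall record.

    Assumption: the sample standard deviation is used.
    """
    return k * stdev(daily_rain_mm)


def heavy_rain_days(daily_rain_mm, threshold_mm):
    """Indices of the days whose rainfall exceeds the threshold."""
    return [i for i, r in enumerate(daily_rain_mm) if r > threshold_mm]


def change_relative_to_peak(value_mm, peak_mm):
    """Signed change (%) from the peak to another value, normalized by the peak.

    Assumption: normalizing by the peak is inferred from the quoted numbers.
    """
    return 100.0 * (value_mm - peak_mm) / peak_mm


# Synthetic daily rainfall (mm): 99 dry days and one 60 mm day; not real data.
rain = [0.0] * 99 + [60.0]
threshold = heavy_rain_threshold(rain)
events = heavy_rain_days(rain, threshold)

# PWV values (mm) quoted in the text for the two case studies.
armi_rise = -change_relative_to_peak(31.34, 37.17)  # rise towards the peak
armi_drop = change_relative_to_peak(30.23, 37.17)   # drop after the peak
bdmt_rise = -change_relative_to_peak(29.68, 36.36)
bdmt_drop = change_relative_to_peak(22.59, 36.36)

print(f"threshold = {threshold:.2f} mm, event days = {events}")
print(f"ARMI: +{armi_rise:.1f}% / {armi_drop:.1f}%")
print(f"BDMT: +{bdmt_rise:.1f}% / {bdmt_drop:.1f}%")
```

With a 5-standard-deviation criterion, a record needs at least a few dozen days before any single day can exceed the threshold, which is why the synthetic series above contains 100 values.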

Conclusions and Outlook
Due to its complex topography and synoptic-scale spatiotemporal circulation patterns, Ethiopia is a very challenging country for studying the spatiotemporal variability of temperature, rainfall, and precipitable water vapor. Unfortunately, observational weather and climate data are sparse in the country, due to availability and accessibility problems, and a lack of data continuity. Although ground-based GPS devices retrieving the PWV are relatively low cost and require limited maintenance, their coverage in Ethiopia is also rather limited in space and time. In this paper, we relied on a sample consisting of eight sites in Ethiopia and one site close to the eastern Ethiopian border (Djibouti), with a maximum common time period of 2013-2020. Only two of our sites have data before 2013; other Ethiopian GPS sites with data in 2007-2011 [24,25] were decommissioned. Over data-sparse regions, reanalysis products are often taken as an alternative solution to assess the spatiotemporal variability of essential climate variables like water vapor, but a proper evaluation of their strengths and weaknesses, especially when data assimilation is rather restricted, should not be overlooked.
In contrast to earlier studies over Ethiopia, we used the latest ECMWF reanalysis product, ERA5, to retrieve PWV values. We evaluated the ERA5-PWV output at the nine GPS site locations in (or close to) Ethiopia to assess whether the improvement of ERA5 with respect to previous reanalyses in terms of temperature and rainfall is also reflected in the precipitable water vapor. More specifically, the following two questions were addressed: (1) Are there potential systematic PWV biases that depend on the very complex orography? (2) Does the higher spatial and temporal resolution of ERA5 compared to its predecessor ERA-Interim lead to a closer agreement with GPS in terms of the PWV seasonal and diurnal cycles? This is the first study assessing the ERA5-PWV in Ethiopia.
First, we validated our GAMIT ZTD retrievals against the IGS ZTD product at 18 IGS sites with long data series (2013-2020). Additionally, both the ZTD processing and the ZTD to PWV conversion at Addis Ababa have been evaluated with radiosonde observations. Those comparisons confirmed the good quality of our GAMIT-PWV processing, giving confidence in the reliability of the generated GAMIT-PWV dataset at the Ethiopian sites. The GAMIT-PWV correlates well with the ERA5-PWV at the Ethiopian GPS sites, with correlation coefficients of at least 0.96. For the majority of the sites, GAMIT-PWV shows a wet bias (within 2 mm) with respect to ERA5-PWV. The largest wet biases for GAMIT-PWV are obtained for stations with a significant height difference between the GPS site and the ERA5 surface grid (ASOS: 3.62 mm wet bias, 865 m height difference, and DEBK: 2.84 mm wet bias, 1218 m height difference). An improvement of the height difference correction scheme (between the GPS site and the ERA5 surface grid at the site location) is therefore envisaged. Earlier analyses [24,25] concluded that GPS-PWV had a dry bias compared to ERA5's predecessor ERA-Interim over the Ethiopian lowlands, and a wet bias over the highlands; here, in contrast, we do not find such a clear link between the ERA5-GPS PWV mean differences and station height. Furthermore, the RMSE values obtained between GPS and ERA5 in our study are significantly lower than the values reported in those previous studies between GPS and ERA-Interim.
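For reference, the agreement statistics mentioned above (mean bias, RMSE, and correlation coefficient) can be computed from two co-located PWV series as in the following minimal sketch. The function name and the sample values are hypothetical, the bias is taken here as GPS minus ERA5 so that a positive value corresponds to a GPS wet bias, and the sketch deliberately omits any correction for the height difference between the GPS site and the ERA5 surface grid.

```python
import math


def compare_pwv(gps_mm, era5_mm):
    """Return (bias, rmse, pearson_r) for two equally long PWV series.

    bias is defined as mean(GPS - ERA5): positive means GPS is wetter.
    """
    n = len(gps_mm)
    diffs = [g - e for g, e in zip(gps_mm, era5_mm)]
    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    mean_g = sum(gps_mm) / n
    mean_e = sum(era5_mm) / n
    cov = sum((g - mean_g) * (e - mean_e) for g, e in zip(gps_mm, era5_mm))
    var_g = sum((g - mean_g) ** 2 for g in gps_mm)
    var_e = sum((e - mean_e) ** 2 for e in era5_mm)
    pearson_r = cov / math.sqrt(var_g * var_e)
    return bias, rmse, pearson_r


# Made-up co-located PWV values (mm); not data from this study.
gps = [20.0, 25.0, 30.0, 35.0]
era5 = [19.0, 23.0, 29.0, 33.0]

bias, rmse, r = compare_pwv(gps, era5)
print(f"bias = {bias:.2f} mm, RMSE = {rmse:.2f} mm, r = {r:.3f}")
```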
Making use of the GAMIT-PWV and ERA5-PWV datasets, we also investigated how similarly they capture the spatiotemporal PWV variability in Ethiopia. Given the rather limited time length of the GPS datasets at our small sample of stations, we restricted ourselves to comparing the seasonal and diurnal PWV cycles in both datasets. We found that the PWV seasonal variability is represented similarly by the two datasets and is also clearly linked with the seasonal cycle of precipitation. This result for PWV is consistent with the finding that ERA5 is able to capture the seasonal cycles of temperature [16] and rainfall [16,29] in Ethiopia very well. In Ethiopia, the variation of temperature within a year is rather modest, and its direct link with the PWV seasonal cycle is less clear. Because of the high temporal resolution of the GAMIT-PWV and ERA5-PWV datasets compared to, e.g., [24,25], we also had a closer look at the mean PWV diurnal cycle at the Ethiopian stations. Stations like ARMI and BDMT have higher diurnal amplitudes than the stations in the highlands. The former stations are located in areas where evapotranspiration is high, with very high temperatures.
In the last section, the data of the two stations ARMI and BDMT were used to explore the potential of GPS-PWV monitoring for nowcasting severe rainfall events. The wider areas around those stations have been frequently affected by flooding events. We therefore selected heavy rainfall events in the daily precipitation time series at those stations and analyzed the GPS-PWV behavior preceding and following such events. We found that GPS-PWV tends to increase from several hours before heavy rainfall (by about 25% on average from 12 h before the rainfall starts), reaches a peak during rainfall, and decreases after the heavy rainfall (also by about 25% on average at 12 h after the start of the rainfall). This study illustrates the opportunity that a denser network of GPS sites in Ethiopia can provide for setting up a warning system for heavy rainfall.
While ERA5 is much closer to observations than its predecessor, its horizontal resolution of 0.25° × 0.25° can still be considered too coarse for assessing the PWV variability over Ethiopia. Additionally, in order to obtain reliable climate projections, especially for extreme precipitation, climate simulations at convection-permitting resolutions are necessary [21,67]. In this study, in which we wanted to assess the present-day representation of PWV by GPS and ERA5, we are spatially limited by the coverage of the GPS network in Ethiopia, which mainly provides "point" observations. However, the use of a gridded dataset like ERA5 would enable us to better investigate the spatial variability of the PWV over the entire country.
Moreover, ERA5 has been available since 1979, giving the opportunity to study both PWV and rainfall over Ethiopia over the past decades. In a follow-up study, we will use ERA5 to validate an ensemble of regional climate models over Ethiopia [20,22] in terms of PWV and rainfall. In a next step, we will then investigate the impact of climate change on PWV and extreme rainfall.

Funding: This work has been partly funded by the Solar-Terrestrial Centre of Excellence (STCE), supported by the Belgian Federal Science Policy Office.