An ERA5-Based Hourly Global Pressure and Temperature (HGPT) Model

: The Global Navigation Satellite System (GNSS) meteorology contribution to the comprehension of the Earth’s atmosphere’s global and regional variations is essential. In GNSS processing, the zenith wet delay is obtained using the di ﬀ erence between the zenith total delay and the zenith hydrostatic delay. The zenith wet delay can also be converted into precipitable water vapor by knowing the atmospheric weighted mean temperature proﬁles. Improving the accuracy of the zenith hydrostatic delay and the weighted mean temperature, normally obtained using modeled surface meteorological parameters at coarse scales, leads to a more accurate and precise zenith wet delay estimation, and consequently, to a better precipitable water vapor estimation. In this study, we developed an hourly global pressure and temperature (HGPT) model based on the full spatial and temporal resolution of the new ERA5 reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). The HGPT model provides information regarding the surface pressure, surface air temperature, zenith hydrostatic delay, and weighted mean temperature. It is based on the time-segmentation concept and uses the annual and semi-annual periodicities for surface pressure, and annual, semi-annual, and quarterly periodicities for surface air temperature. The amplitudes and initial phase variations are estimated as a periodic function. The weighted mean temperature is determined using a 20-year time series of monthly data to understand its seasonality and geographic variability. We also introduced a linear trend to account for a global climate change scenario. Data from the year 2018 acquired from 510 radiosonde stations downloaded from the National Oceanic and Atmospheric Administration (NOAA) Integrated Global Radiosonde Archive were used to assess the model coe ﬃ cients. Results show that the GNSS meteorology, hydrological models, Interferometric Synthetic Aperture Radar (InSAR) meteorology, climate studies, and other topics can signiﬁcantly beneﬁt from an ERA5 full-resolution model.


Introduction
The electromagnetic waves emitted from a satellite source when propagating through the atmosphere are affected by the electron content in the ionosphere and by the neutral atom and molecule densities in the troposphere [1]. This effect is observed as a path delay induced in the measurement of the travel time, resulting in an increase in the travel path length. The free electron-microwave interaction is dispersive, meaning that the ionospheric delay contribution depends on the signal

HGPT Model Formulation
In this section, we introduce the model formulation. The main differences between this model and the models described previously are (1) the introduction of a quarterly periodicity for the temperature to account for the intraseasonal signal fluctuations caused by the Madden-Julian oscillation (MJO) system; (2) the introduction of the rate of change to account for a global climate change scenario; (3) the use of a time-segmentation concept, which is focused on a set of hourly coefficients; and (4) the use of the high horizontal (and vertical) resolution offered by the NWMs. The output of the HGPT model includes the surface air temperature (T), the surface pressure (P), the zenith hydrostatic delay (ZHD), and the weighted mean temperature (T m ) for a specific geographic location and time.

Global Temperature and Pressure
The model is based on a time-segmentation concept; in other words, simulations are extracted every hour from the 1-h temporal resolution time series, obtaining 24 time series, one for each hour of the day, with a 24-h temporal resolution for each grid point. Based on these data, it is possible to obtain different global coefficients for the surface air temperature and pressure at each hour and grid point using Fourier analysis as a function of the location and time [19]. We also introduced the linear trend to account for local long-term temperature trends. For the surface air temperature (T h ), three periodic functions were considered to account for the annual, semi-annual, and quarterly periodicities, as given by Equation (1): 365. 25 where t is the modified Julian date (MJD); t 0 is the date of the first observation; h is the hour (UTC) derived from t; a 1 , a 2 , and a 3 are the annual, semi-annual, and quarterly amplitudes, respectively; f 1 , f 2 , and f 3 are the annual, semi-annual and quarterly initial phases, respectively; and a and b are the regression coefficients. The amplitudes, initial phases, and the linear coefficients are bilinearly interpolated for the requested geographic location and for each hour. For the surface pressure (P h ) only the annual and semi-annual periodicities were considered (Equation (2)): (2) We omitted the quarterly periodicity, since it has no significant amplitude signal.

Zenith Hydrostatic Delay (ZHD)
The zenith hydrostatic delay (ZHD) is essentially due to the dry gases in the atmosphere, accounting for approximately 90% of the zenith total delay (ZTD = ZHD + ZWD). It can be calculated from empirical models that are a function of the surface data, namely pressure and temperature. The most used are from Saastamoinen [20] and Hopfield [6]. In this study we used the Saastamoinen model modified by Davis et al. [21], as given by Equation (3): where P s is the surface pressure in hPa, ϕ is the latitude of the station, and h s is the altitude above sea level in meters. The ZHD is calculated using the pressure modeled by means of Equation (2), and h s by subtracting the geoid height at the station location. The Saastamoinen model has been extensively studied in the literature, and it provides high global accuracies when compared to other models [3, 8,22,23]. However, any alternative model can be easily implemented with the data provided in this study. The accuracy of the Saastamoinen model is nearly proportional to the accuracy of P s . Improving the surface pressure calculation is, therefore, crucial for improving the calculation of the second component, the wet delay, as ZWD = ZTD − ZHD. The ZWD is due to the troposphere water vapor content. Clarifying and quantifying this component is vital to understanding the water vapor dynamics in the atmosphere and its interaction with various elements that control weather and climate systems.

Weighted Mean Temperature (T m )
The precipitable water vapor (PWV) is the water equivalent of the integrated water vapor amount per unit area. The PWV is expressed in length units and is linearly proportional to the ZWD [9]. The relation is given by PWV = Π · ZWD [10], in which Π is a conversion factor, and can be obtain through Equation (4): where ρ w is the liquid water density; R w is the specific gas constant for water vapor; k 1,2,3 are refractivity constants (as given by the best average solution proposed by Rüeger [24]); and k 2 = k 2 − m, with m Remote Sens. 2020, 12, 1098 5 of 18 being the ratio of the molar masses of water vapor and dry air. T m is the atmosphere weighted mean temperature calculated along a vertical profile, which is defined as [9]: where e is the partial water vapor pressure in hPa and T is the air temperature in K. T m is the only unknown parameter in Equation (4) and depends on the vertical profiles of e and T, which are only measured by radiosondes or simulated by NWMs. Since the vertical profiles of e and T at GNSS sites are very uncommon, T m can be expressed as a linear function of the surface temperature, T s . Improving the weighted mean temperature accuracy is essential to provide a reliable PWV. To calculate the weighted mean temperature, T m , we used a linear equation: where α and β are the linear regression coefficients. To account for the seasonality and geographic variability intrinsic to T m , the coefficients are estimated using a time series of surface air temperature and air temperature (in K) provided by NWMs.

Data and Model Computation
The coefficients of the model [25] were estimated using the ERA5 reanalysis data. ERA5 is the fifth generation of the ECMWF atmospheric reanalysis that has been operational at ECMWF since 2016; it covers the period from 1979 till the present (from 1950 by early 2020), and has several significant innovative features beyond that of the discontinued ERA-Interim reanalysis model. The major improvement is attributed to an increase in the horizontal grid spacing (from 79 to 31 km), in the number of model levels (from 60 to 137), and in the temporal resolution (from 6 to 1 h), which enables an improved atmospheric representation of convective systems, gravity waves, tropical cyclones, and other meso-to synoptic-scale atmospheric structures [26]. Another ERA5 improvement is the number of observations that are assimilated, which went from on average 0.75 million per day in 1979 to about 24 million per day in 2018, boosted mainly by the increase of satellite radiances throughout the period, and more recently, by the GNSS-Radio Occultation, scatterometer ocean vector wind and altimeter wave height data, ozone products, and ground-based radar observations [27].
We used 20 years (between 1999 and 2018) of surface air temperature and pressure at the full ERA5 spatial resolution of 0.25 • × 0.25 • (about 28 km) with a grid of 721 grid points in latitude and 1440 in longitude (a total of 1,038,240 data points), and a 1-h temporal resolution. Note that the 0.25 • × 0.25 • resolution is a bilinear interpolation of the native resolution of 0.28125 • × 0.28125 • (31 km). This step is necessary due to limitations in the ECMWF's NetCDF (Network Common Data Form) implementation. For each grid point, we have a time series of 175,320 surface air temperature and pressure simulations between 1 January 1999, 00:00 UTC, and 31 December 2018, 23:00 UTC, totaling about 3.64 × 10 9 simulations. Following the time-segmentation concept, we extracted from the 1-h temporal resolution time series the simulations at the same hour for each time series, obtaining 24 time series with a 24-h temporal resolution at each grid point. The linear coefficient values were obtained using a linear regression model. The amplitude and initial phase coefficients were obtained using a Fourier analysis. After detrending and removing the mean of the time series, a fast Fourier transform (FFT) was applied. The amplitudes were obtained after identifying the frequency that corresponded to each periodicity and the initial phase was given by the inverse tangent of the FFT. Both methods were applied at each hour and grid point (about 25 million) for temperature and pressure. All temperature coefficients were saved in an external binary file, but only the grid for a requested hour was loaded at every run of the HGPT subroutine to save time and improve efficiency. Likewise, the pressure coefficients were also saved in a binary file, and as for temperature, only the grid for the requested hour was loaded. The temperature and pressure coefficients obtained were referred to as the ERA5 surface elevation grid. The ERA5 vertical datum was the mean sea level and was based on the WGS84 Earth gravitational model (EGM 96) geoid [25,28]. The HGPT model uses the pressure and temperature lapse rate (see Xu [2], page 56) to convert pressure and temperature to the desired height. For GNSS data processing, we needed to consider that the GNSS heights are referenced to the WGS84 ellipsoid. To transform the GNSS ellipsoidal heights to the mean sea level system such that they become geoid-based, the following relation is used: where h is the GNSS ellipsoidal height, N is the geoid height (undulation), and H is the GNSS orthometric height (at a global mean sea level). We used the EGM96 geopotential model to degree and order 360 to obtain a geoid height (in meters) at the same ERA5 horizontal grid points [29]. Applying Equation (6) implies the determination of α and β coefficients for each grid point. For this, we used 20 years of monthly surface air temperature, air temperature, specific humidity, geopotential, and pressure at 137 model levels (from surface pressure to 0.01 hPa), which is 77 levels more than ERA-Interim. Applying Equation (5) implies the calculation of the partial pressure of water vapor, e, at all model levels. The following equation was used: where P is the pressure in hPa; q is the specific humidity in kg/kg; and ε is the ratio between the molar masses of water and dry air, which is equal to 0.622. After this step, Equation (5) was used to obtain a monthly temporal resolution time series for T m , integrating along vertical profiles, one per grid point and month. The regression coefficients were obtained using the least-squares approach. From this, we obtained the globally gridded coefficients α and β, and saved them in a binary file. Both were bilinearly interpolated at the requested geographic location.

Results and Discussion
In this section, we present and discuss the mean global distribution of the a and b coefficient values, and each amplitude considered in the temperature and pressure models. The horizontal variation of the amplitude was analyzed at the ERA5 resolution (in comparison to the resolution used by the other models), and the introduction of a trend instead of the global mean was quantified. Figure 1 shows the global distribution of the mean coefficients of surface air temperature. The a coefficient exhibited a minimum at the poles and a maximum near the equator, which also decreased with increasing altitude, e.g., such as in the Himalayas, Andes, or the Rocky Mountains ( Figure 1a). This parameter approximately corresponded to the normal temperature profile, but it cannot be directly used as an indicator of the mean temperature. Globally, mean temperature trends were higher over continental areas than in sea areas, except for northern Russia (from Barents Sea to Chukchi Sea). Nevertheless, high-temperature trends were also observed over oceanic regions, such as the Pacific Ocean in the south of California and the Atlantic Ocean near the northeastern United States (Figure 1b). Figure 1f shows the impact of introducing a linear trend in the temperature model as an alternative of just using the mean temperature. There was a difference of around 20% at the Equator and Colombia coast, 5% to 10% at the Pacific Ocean (between 10 • S and 30 • N), the Indian Ocean close to Madagascar, near the African Sahel, and at higher latitudes (northern Russia). Globally, there was a mean value close to 1%. Regarding the seasonal components, Figure 1c shows that the annual amplitude increased with latitude and was larger for terrestrial areas, with the maximum amplitudes in Siberia, Mongolia, northeast of China, Alaska, and north of Canada. Figure 1d shows that the amplitude of the semi-annual component was larger over terrestrial areas, reaching maximum amplitudes over the Antarctic, but it was also significant for Greenland, the African Sahel, and India (due to the monsoons). The quarterly mean amplitude had a maximum value of 2 K over Siberia, and between 1 to 1.5 K in the east of Africa, specifically Ethiopia and Somalia, in the north of Canada, Himalayas, and some parts over Antarctic (Figure 1e). This periodicity was related to the major mode of tropical atmospheric variability on intraseasonal time scales, namely the Madden-Julian oscillation (MJO) system [30]. There was a large-scale link between the atmospheric circulation and tropical deep atmospheric convection, specifically in the northern temperate and polar zones, due to variations in the (1) Arctic oscillations, (2) rainfall in East Asia, (3) North American monsoon system, (4) Indian monsoon, and (5) surface air temperature at high latitudes in the Northern Hemisphere during winter [31,32]. Southern polar zone variations in the Antarctic oscillations are also related to the MJO [33]. Including this periodicity in the model led to an RMSE improvement of between 2.5% to 5% over the mentioned areas. Globally, the mean improvement due to the quarterly signal was 0.2%.
To further justify the high spatial resolution, the hourly mean maximum variability and standard deviation for each periodicity was calculated by using a window of 4 × 4 grid points, which was Remote Sens. 2020, 12, 1098 8 of 18 approximately the 1 • × 1 • horizontal grid that is normally adopted by other models. Figure 2 shows the maximum variability found for each amplitude. For the annual amplitude, differences up to 10 K in windows with a standard deviation of 4 K (not shown) were observed (Figure 2a). These variations were found along seacoasts, inland coasts, and around elevated areas, e.g., the Himalayas. These variations were expected in those areas due to complex weather dynamics between the ocean and land, or due to abrupt orographic changes, leading to larger temperature gradients in those areas. The semi-annual amplitude showed differences up to 3 K with a standard deviation of 1.5 K (Figure 2b). Larger values were found not only for the same places as in the annual amplitude differences, but also in elevated areas, e.g., Rocky Mountains, and near the equator, e.g., the African Sahel and northern South America. The quarterly amplitude showed differences up to 1 K with a standard deviation of 0.4 K for the same places ( Figure 2c).
Remote Sens. 2020, 12, x FOR PEER REVIEW 8 of 17 between 14:00 and 15:00 UTC. For the semi-annual amplitude, we found 1-hour variations of up to 1.7 K over India, north of China, Mongolia, and Siberia, between 01:00 and 02:00 UTC. For the quarterly amplitude, 1-hour variations reached 0.6 K over Siberia, between 21:00 and 22:00 UTC. The spatial and temporal variations found for the annual and semi-annual periodicities justified the use of the ERA5 full resolution model. It was obvious that the HGPT model not only improved the surface air temperature at seacoasts, where a considerable number of GNSS stations were available, but also the hydrological modeling at inland coasts. Considering the temperature of two locations with distinct climatological and topographical characteristics, for example, the Sahara desert (λ = 0°, φ = 25°N) and the Amazon Rainforest (λ = 65°W, φ = 0°) for a one-year time-span (2018), one can observe from Figure 3 that the time-segmentation model reproduced the diurnal cycle for both locations quite well.
Maximum spatial variability for each surface air temperature periodicity, which was calculated using a window of 4 × 4 grid points, approximately a 1° × 1° horizontal grid: (a) maximum spatial variability for the annual amplitude (K), (b) maximum spatial variability for the semi-annual amplitude (K), and (c) maximum spatial variability for the quarterly amplitude (K). The estimated coefficients for the surface pressure are shown in Figure 4. The coefficient values were lower inland and decreased considerably as the altitude increased, reaching a minimum in the Himalayas, followed by the Andes, the Antarctic, and Greenland. The mean trend for surface pressure was negative at the polar regions and positive in temperate zones. Introducing a linear trend in the pressure model as an alternative of just using the mean pressure led to an improvement of up Maximum spatial variability for each surface air temperature periodicity, which was calculated using a window of 4 × 4 grid points, approximately a 1 • × 1 • horizontal grid: (a) maximum spatial variability for the annual amplitude (K), (b) maximum spatial variability for the semi-annual amplitude (K), and (c) maximum spatial variability for the quarterly amplitude (K).
By combining the three seasonal components' maximum values, we obtained a spatial temperature variation of 14 K inside a window of 1 • . This variation could lead to a ZWD variation of up to 2.5 mm. A similar analysis was performed for the temporal resolution (not shown). We found 1-h variations up to 4 K for the annual amplitude in the west part of the United States and Canada, between 14:00 and 15:00 UTC. For the semi-annual amplitude, we found 1-h variations of up to 1.7 K over India, north of China, Mongolia, and Siberia, between 01:00 and 02:00 UTC. For the quarterly amplitude, 1-h variations reached 0.6 K over Siberia, between 21:00 and 22:00 UTC. The spatial and temporal variations found for the annual and semi-annual periodicities justified the use of the ERA5 full resolution model. It was obvious that the HGPT model not only improved the surface air temperature at seacoasts, where a considerable number of GNSS stations were available, but also the hydrological modeling at inland coasts. Considering the temperature of two locations with distinct climatological and topographical characteristics, for example, the Sahara desert (λ = 0 • , ϕ = 25 • N) and the Amazon Rainforest (λ = 65 • W, ϕ = 0 • ) for a one-year time-span (2018), one can observe from Figure 3 that the time-segmentation model reproduced the diurnal cycle for both locations quite well.
(c) Figure 2. Maximum spatial variability for each surface air temperature periodicity, which was calculated using a window of 4 × 4 grid points, approximately a 1° × 1° horizontal grid: (a) maximum spatial variability for the annual amplitude (K), (b) maximum spatial variability for the semi-annual amplitude (K), and (c) maximum spatial variability for the quarterly amplitude (K). The estimated coefficients for the surface pressure are shown in Figure 4. The coefficient values were lower inland and decreased considerably as the altitude increased, reaching a minimum in the Himalayas, followed by the Andes, the Antarctic, and Greenland. The mean trend for surface pressure was negative at the polar regions and positive in temperate zones. Introducing a linear trend in the pressure model as an alternative of just using the mean pressure led to an improvement of up The estimated coefficients for the surface pressure are shown in Figure 4. The a coefficient values were lower inland and decreased considerably as the altitude increased, reaching a minimum in the Himalayas, followed by the Andes, the Antarctic, and Greenland. The mean trend for surface pressure was negative at the polar regions and positive in temperate zones. Introducing a linear trend in the pressure model as an alternative of just using the mean pressure led to an improvement of up to 5% in the Andes region, and about 2% in the Pacific and Atlantic Oceans, as well as some regions of Africa (Figure 4e). The mean annual amplitude was larger than 15 hPa in Greenland and parts of Asia, Australia, and some parts of the Antarctic (varies between 5 and 10 hPa), reaching a minimum near the equator. The mean semi-annual amplitude reached up to 4 hPa in the Antarctic and the southern and northern Pacific Ocean. To justify the high spatial and temporal resolutions, we performed the same analysis as described previously. Figure 5 shows the maximum variability found for each amplitude. For the annual amplitude, we found differences of up to 12 hPa in windows with a standard deviation of 4.5 hPa (not shown). Most of these variations can be seen in Asia in the boundaries between plain and elevated areas. The semi-annual amplitude variations reached up to 2 hPa with a standard deviation of 0.5 hPa, which were located at seacoasts, mainly in the Antarctic and Greenland. Combining these differences could lead to variations of up to 32 mm in the ZHD using the Saastamoinen formula. Regarding the temporal resolution, we found 1-h variations for the annual amplitude of up to 1 hPa in some parts of Asia, the African Sahel, and in the Pacific Ocean between 09:00 and 10:00 UTC. The 1-h variations for the semi-annual amplitude reached 0.5 hPa in several locations, such as south of Africa, west of the United States, and in the southern Atlantic Ocean, for the same period.
to 5% in the Andes region, and about 2% in the Pacific and Atlantic Oceans, as well as some regions of Africa (Figure 4e). The mean annual amplitude was larger than 15 hPa in Greenland and parts of Asia, Australia, and some parts of the Antarctic (varies between 5 and 10 hPa), reaching a minimum near the equator. The mean semi-annual amplitude reached up to 4 hPa in the Antarctic and the southern and northern Pacific Ocean. To justify the high spatial and temporal resolutions, we performed the same analysis as described previously.   Figure 5 shows the maximum variability found for each amplitude. For the annual amplitude, we found differences of up to 12 hPa in windows with a standard deviation of 4.5 hPa (not shown). Most of these variations can be seen in Asia in the boundaries between plain and elevated areas. The semi-annual amplitude variations reached up to 2 hPa with a standard deviation of 0.5 hPa, which were located at seacoasts, mainly in the Antarctic and Greenland. Combining these differences could lead to variations of up to 32 mm in the ZHD using the Saastamoinen formula. Regarding the temporal resolution, we found 1-hour variations for the annual amplitude of up to 1 hPa in some parts of Asia, the African Sahel, and in the Pacific Ocean between 09:00 and 10:00 UTC. The 1-hour variations for the semi-annual amplitude reached 0.5 hPa in several locations, such as south of Africa, west of the United States, and in the southern Atlantic Ocean, for the same period. The estimated regression coefficients, and , that were used in the estimation of the weighted mean temperature ( ) from the surface air temperature ( ) data are shown in Figure 6. The coefficient had a mean global value of 66.6 ± 17.5 K, a mean value of 97.6 ± 20.4 K over land, and a mean value of 45.9 ± 21.6 K over the ocean (Figure 6a). The coefficient also displayed an insignificant spatial variability for latitudes higher than 30°N and lower than 30°S; the largest variability in those regions was verified to be over the ocean. The coefficient showed a similar behavior, with a mean global value of 0.73 ± 0.06, a mean value of 0.61 ± 0.07 over land, and a mean value of 0.81 ± 0.07 over the ocean. Over land, and excluding the tropical zone, this coefficient displayed a slight variability, with a mean value of 0.63 ± 0.05 (Figure 6b). Around the equator (from 30°S to 30°N, i.e., the tropical The estimated regression coefficients, α and β, that were used in the estimation of the weighted mean temperature (T m ) from the surface air temperature (T s ) data are shown in Figure 6. The α coefficient had a mean global value of 66.6 ± 17.5 K, a mean value of 97.6 ± 20.4 K over land, and a mean value of 45.9 ± 21.6 K over the ocean (Figure 6a). The coefficient also displayed an insignificant spatial variability for latitudes higher than 30 • N and lower than 30 • S; the largest variability in those regions was verified to be over the ocean. The β coefficient showed a similar behavior, with a mean global value of 0.73 ± 0.06, a mean value of 0.61 ± 0.07 over land, and a mean value of 0.81 ± 0.07 over the ocean. Over land, and excluding the tropical zone, this coefficient displayed a slight variability, with a mean value of 0.63 ± 0.05 (Figure 6b). Around the equator (from 30 • S to 30 • N, i.e., the tropical zone), both coefficients showed a significant variability over both land and sea. This behavior can be explained in terms of the tropical atmospheric circulation (Hadley cells) and the ocean currents formed in this region, but more specifically by the Intertropical Convergence Zone (ITCZ) system. In tropical regions, the ITCZ transfers ocean heat and moisture from the lower levels of the atmosphere to the upper levels of the troposphere and to medium and high latitudes. These phenomena create rapid fluctuations in the air temperature and moisture with a slow impact on the surface air temperature. The ITCZ shifts during the year and tends to be located where the sun's rays strike the ground more directly, i.e., during each hemisphere's respective summers [34][35][36]. This phenomenon is easily identified in Figure 6c, where the distinct belts in both hemispheres stand out. Figure 6c shows the Pearson correlation coefficient between the T m , obtained using numerical integration, and the T s . It shows a mean global value of 0.82 ± 0.06, a mean value of 0.91 ± 0.10 over land, and a mean value of 0.79 ± 0.09 over the ocean. By removing the influence of the tropical zone, a stronger linear correlation (higher than 0.9) was obtained. Figure 6d shows the root mean square error (RMSE) between the T m obtained using a numerical integration and the T m calculated using the α and β coefficients, and the T s . We obtained a global error of 1.09 ± 0.08 K, an error of 1.28 ± 0.09 K over land, and an error of 0.99 ± 0.11 over the ocean. Areas with the highest RMSE values were Hudson Bay, connected to the Atlantic Ocean, showing lower salinity levels and covered by ice year-round, and therefore low evaporation rates, and the Okhotsk Sea, in the Pacific Ocean, also with low evaporation rates due to lower salinity levels and ice cover [37]. This can be an indicator of a deficient thermodynamic sea ice model simulation for large bays in the ERA5. However, this indicator needs further study to evaluate its feasibility, which is beyond the scope of this work.
Several studies have been carried out, using different techniques, to obtain T m , and several global and local models have been proposed. Bevis et al. [10] used 8718 radiosonde profiles at 13 U.S. sites over 2 years and proposed the following equation T m = 70.2 + 0.72T s , which has been extensively used in GNSS meteorology, mainly in the Northern Hemisphere. Bevis [9] compared 502 radiosonde profiles with 12-h forecasts from the National Meteorological Center's nested grid model to find an RMSE of 2.4 K and concluded that further improvements are possible with models at the highest resolutions. Mendes [3] proposed a linear model based on 32,500 radiosonde profiles over one year at 50 sites between 62 • S and 83 • N. Wang et al. [38] used the ERA40 reanalysis with a 1.125 • × 1.125 • grid, 60 hybrid vertical levels, and a 6-h temporal resolution to provide a global T m model. Yao et al. [39] combined radiosonde profiles with the GPT model to create virtual stations over sea, compensating for the lack of data in ocean areas. T m shows a substantial seasonal and geographic variability. The T m and T s values are more correlated in temperate zones and frigid zones, and less correlated in the tropic zone. Furthermore, the correlation tends to decrease in summer and increase in winter [40]. The same authors include the seasonal and geographic dependence in the regression coefficients, as well as regional T m models [41][42][43][44][45][46][47][48]. An assessment of global and regional T m models can be seen in References [40,49]. The PWV accuracy is proportional to the T m accuracy [9]. Improving the PWV accuracy is fundamental for severe weather monitoring and climate study, as the T m modeling still leaves considerable room for improvements when using high-resolution NWM data.
1 Figure 6. (a,b) α and β regression coefficients, obtained using 20 years of monthly ERA5 data, which were used to estimate the weighted mean temperature (T m ) from the surface air temperature (T s ). (c) Pearson correlation coefficient between the T m obtained using ERA5 numerical integration at each grid point and T s . (d) RMSE between T m obtained using α, β, and T s , and the T m obtained using ERA5's numerical integration (see Equation (5)). The estimation of the coefficients in the tropic regions could generate artefacts, especially over the sea, when T s did not vary over a range large enough to get a satisfactory estimation of the parameters. We validated the model by comparing the estimated model with the ERA5 surface air temperature and pressure. Figure 7 shows the mean global RMSE and the bias for both parameters. The temperature RMSE values increased as latitude increased, mostly over land, achieving a maximum value of 8 K (close to the Yenisei Gulf, Russia), and a maximum of 6 K over the ocean (in the Antarctic Ocean). It gave mean RMSEs of 4.8 ± 0.4 K, 1.9 ± 0.1 K, and 2.7 ± 0.2 K over land, over the ocean, and globally, respectively. The minimum RMSE values were observed in the tropical zone (as expected, due to the lower temperature variations). Canada, Alaska, Greenland, and Russia showed the worst RMSE, possibly due to the high surface air temperature variability at several temporal scales; furthermore, the air temperature has significantly increased in those regions in the last few decades in comparison to other regions [50][51][52]. Figure 7c shows the temperature bias spatial distribution, with a mean global value of 0.0 ± 0.6 K and a maximum absolute value of 2.8 K. Positive values indicate an HGPT model overestimation (e.g., northern of Canada and western Russia) and negative values indicate an underestimation (e.g., Antarctic regions). The pressure RMSE values also increased as latitude increased, but mostly over the ocean, achieving a maximum RMSE value of 16 hPa in the Antarctic Ocean, and a maximum of 13.4 hPa over land, in northern Russia. It gave mean RMSEs of 7.1 ± 1.0 hPa, 7.6 ± 0.8 hPa, and 7.1 ± 0.3 hPa over land, over the ocean, and globally, respectively. The worst RMSE values were found over the Antarctic Ocean, probably due to the complex atmospheric turbulence caused by the Antarctic circumpolar current (ACC) system [53], but also in the north of the Pacific and the Atlantic Oceans. Like temperature, the minimum RMSE values were found in the tropical zone. Figure 7d shows the pressure bias spatial distribution, with a mean global value of −0.1 ± 0.7 hPa and a maximum absolute value of 2.4 hPa. As stated previously, positive values indicate an HGPT model overestimation (e.g., Greenland) and negative values indicate an underestimation (e.g., western Russia and Scandinavia). complex atmospheric turbulence caused by the Antarctic circumpolar current (ACC) system [53], but also in the north of the Pacific and the Atlantic Oceans. Like temperature, the minimum RMSE values were found in the tropical zone. Figure 7d shows the pressure bias spatial distribution, with a mean global value of −0.1 ± 0.7 hPa and a maximum absolute value of 2.4 hPa. As stated previously, positive values indicate an HGPT model overestimation (e.g., Greenland) and negative values indicate an underestimation (e.g., western Russia and Scandinavia).

Accuracy Assessment
The accuracy assessment was performed using 510 radiosonde observations of surface air temperature and pressure available from the NOAA Integrated Global Radiosonde Archive (IGRA) version 2 [54]. We compared the first air temperature and pressure observations from each radiosonde data, classified as surface observations, for one year (2018), at 00:00 and 12:00 UTC, with the modeled air temperature and pressure values (bilinearly interpolated to the radiosonde sites). Figure 8 shows the RMSE distributions for temperature and pressure at 00:00 UTC. A mean RMSE value of 2.9 ± 1.6 K was obtained for the temperature, with a mean bias value of 0.5 ± 2.1 K. The highest RMSE values were found in the Northern Hemisphere, e.g., northern Russia and the Rocky

Accuracy Assessment
The accuracy assessment was performed using 510 radiosonde observations of surface air temperature and pressure available from the NOAA Integrated Global Radiosonde Archive (IGRA) version 2 [54]. We compared the first air temperature and pressure observations from each radiosonde data, classified as surface observations, for one year (2018), at 00:00 and 12:00 UTC, with the modeled air temperature and pressure values (bilinearly interpolated to the radiosonde sites). Figure 8 shows the RMSE distributions for temperature and pressure at 00:00 UTC. A mean RMSE value of 2.9 ± 1.6 K was obtained for the temperature, with a mean bias value of 0.5 ± 2.1 K. The highest RMSE values were found in the Northern Hemisphere, e.g., northern Russia and the Rocky Mountains. These locations displayed the largest annual amplitude variability compared with other locations (see Figure 1c). The lowest RMSE values were observed in the tropic zone. For the surface pressure, a mean RMSE value of 6.5 ± 2.5 hPa was obtained, with a mean bias value of −1.1 ± 3.8 hPa. The lowest RMSE values were found in the tropic zone. However, we obtained a more random pattern, probably linked to some pressure values, which was observed at different levels, and incorrectly registered as surface observations. A mean RMSE of 2.8 ± 1.5 K and a bias of 0.7 ± 2.0 K were obtained for the air temperature at 12:00 UTC, and a mean RMSE of 6.4 ± 3.1 hPa and a mean bias of −1.1 ± 4.1 K were obtained for the pressure at the same hour. We omitted the figure for this hour since the results were practically identical to those obtained at 00:00 UTC. To understand the magnitude of these values, we performed the same statistical analysis using the original signal from the ERA5 simulations. Table 1 shows the statistical summary obtained from the comparison between the radiosondes values and the HGPT and ERA5 models. On average, the ERA5 RMSE and bias values were about 40% lower when compared to the values obtained for the HGPT model.  Yang et al. [23] compared five different models with distinct global temperature and pressure factors. The best global performance was achieved by a similar time-segmented model, fixing a 2-h temporal resolution and a 2.5 • × 2 • horizontal grid, obtaining a global RMSE of 2.95 ± 2.79 K and 7.87 ± 7.17 hPa for temperature and pressure, respectively. Comparing these results with the results of our study, we obtained an improvement of about 4% in the temperature estimation (from 2.95 ± 2.79 K to 2.85 ± 1.55 K) and about 18% in the pressure estimation (from 7.87 ± 7.17 hPa to 6.45±2.8 hPa). The ZHD can be predicted with a significant accuracy using the Saastamoinen model if accurate surface pressure measurements are available. A global standard deviation of 4.7 hPa was obtained from the difference between the HGPT model and the radiosonde observations. Appling the error propagation law to Equation (3), ignoring any possible errors in the latitude and height of each radiosonde station (Figure 8b), and adopting σP s = 4.7 hPa, we obtained a ZHD global uncertainty of about 10 mm. Around 15% of the stations showed a standard deviation equal or lower than 0.5 hPa, leading to a ZHD uncertainty of about 1 mm. In a similar analysis for temperature, we found a global standard deviation of 1.5 K, and assuming there was no error in α and β (see Equation (6)), we obtained a global T m uncertainty of 1.1 K with a maximum value of 3.5 K over the Atlantic Ocean, close to the Amazon Rainforest ( Figure 9). Remote Sens. 2020, 12, x FOR PEER REVIEW 14 of 17 Lan et al. [55] calculated linear regression coefficients between and at every 2° × 2.5° grid point using data from the ERA-Interim model and data from Global Geodetic Observing System (GGOS), and compared the results with the Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) and radiosonde data to obtain accuracies of 3.1 and 3.8 K, respectively. When compared with our results, we obtained a global accuracy improvement of about 33%. Figure 9. Accuracy spatial distribution for considering a global standard deviation of 1.1 K for surface air temperature and ignoring possible errors in and .

Conclusions
In this work, we presented an ERA5-based hourly global pressure and temperature (HGPT) model. The HGPT model follows the time-segmentation concept and was based on 20 years of 0.25° × 0.25° spatial resolution and 1-h temporal resolution data from the ECMWF ERA5 reanalysis model. Lan et al. [55] calculated linear regression coefficients between T m and T s at every 2 • × 2.5 • grid point using T s data from the ERA-Interim model and T m data from Global Geodetic Observing System (GGOS), and compared the results with the Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) and radiosonde data to obtain accuracies of 3.1 and 3.8 K, respectively. When compared with our results, we obtained a global accuracy improvement of about 33%.

Conclusions
In this work, we presented an ERA5-based hourly global pressure and temperature (HGPT) model. The HGPT model follows the time-segmentation concept and was based on 20 years of 0.25 • × 0.25 • spatial resolution and 1-h temporal resolution data from the ECMWF ERA5 reanalysis model. Its mathematical formulation includes a quarterly periodicity for temperature and a linear trend to account for a global climate change scenario. The model provides not only surface air temperature and pressure, but also a weighted mean temperature, as well as a zenith hydrostatic delay for any specific geographic location and time. The function coefficients are bilinearly interpolated in space using an efficient subroutine developed in Fortran, Matlab, and Python, which is available at https://github.com/pjmateus/hgpt_model, along with the coefficient files in binary format. The inclusion of a quarterly periodicity in modeling the surface air temperature improved the performance of the model by up to 5% at the highest latitudes. The use of a linear trend instead of a mean trend adopted by other models improved the performance of both temperature and pressure modeling by up to 20% and 5%, respectively. The spatial resolution that ERA5 offers will greatly improve the ZHD, mainly at seacoasts, where many GNSS stations are located, and in inland areas, enhancing hydrological modeling. The comparison of the HGPT and ERA5 models with observations of surface air temperature and pressure from 510 globally distributed radiosondes provided by IGRA showed a better performance of the ERA5 model. However, the differences between the two models in terms of RMSE and bias values were minimal. For the estimation of temperature, the RMSE value between the two models differed by 1.2 K, while their bias differed by 0.45 K. Likewise, for the pressure, we found differences of 2.7 hPa and 0.3 hPa. Globally, the ERA-based HGPT model increased the precision of the computation of the hydrostatic component by about 18%, equivalent to an increase in the precision of pressure when compared to other studies, which provided a corresponding increase of the precision in the estimate of the ZWD. Furthermore, an increase in the weighted mean temperature T m of about 33% compared to recent global models was obtained. As a consequence of the higher precision of the ZWD and T m , the precision of the estimated PWV increased.