Use of GNSS Tropospheric Delay Measurements for the Parameterization and Validation of WRF High-Resolution Re-Analysis over the Western Gulf of Corinth, Greece: The PaTrop Experiment

: In the last thirty years, Synthetic Aperture Radar interferometry (InSAR) and the Global Navigation Satellite System (GNSS) have become fundamental space geodetic techniques for mapping surface deformations due to tectonic movements. One major limiting factor to those techniques is the effect of the troposphere, as surface velocities are of the order of a few mm yr − 1 , and high accuracy (to mm level) is required. The troposphere introduces a path delay in the microwave signal, which, in the case of GNSS Precise Point Positioning (PPP), can nowadays be partly removed with the use of specialized mapping functions. Moreover, tropospheric stratiﬁcation and short wavelength spatial turbulences produce an additive noise to the low amplitude ground deformations calculated by the (multitemporal) InSAR methodology. InSAR atmospheric phase delay corrections are much more challenging, as opposed to GNSS PPP, due to the single pass geometry and the gridded nature of the acquired data. Thus, the precise knowledge of the tropospheric parameters along the propagation medium is extremely useful for the estimation and correction of the atmospheric phase delay. In this context, the PaTrop experiment aims to maximize the potential of using a high-resolution Limited-Area Model for the calculation and removal of the tropospheric noise from InSAR data, by following a synergistic approach and integrating all the latest advances in the ﬁelds of remote sensing meteorology (GNSS and InSAR) and Numerical Weather Forecasting (WRF). In the ﬁrst produces satisfactory results, with a percentage of ZTD values within the bias margin ranging from 57% in summer to 63% in autumn. the correlation coefﬁcient R of their difference, i.e., between the GNSS ∆ ZTD and WRF ∆ ZTD of stations and KALA, for seasons S1–S4. Thus, the annual and semi-annual component of ZTD variability is removed from the time series and the ability of the model to resolve small-scale inhomogeneities is demonstrated. Results indicate that model performance is better during the second half of the year, as WRF reproduces 54% of the observed ∆ ZTD variability between the two stations in summer (S3) and 52% in autumn (S4) (R is 0.54 and 0.52 respectively). The correlation is slightly weaker during the ﬁrst half, with R = 0.48 in winter (S1) and R = 0.49 in spring (S2).


Introduction
In recent years, and with the advent of high-performance computer systems (HPCs), high-resolution Limited-Area Model (LAM) applications are being increasingly utilized, both in operational weather forecasting and in atmospheric research as weather re-analysis tools. Atmospheric studies involving dynamical downscaling with high-resolution LAMs focus either on the estimation of local meteorological parameters and climatic patterns (detailed wind fields, precipitation characteristics, local humidity etc.) or on the actual validation of the model used, depending on its parameterization and comparison with observational data. Common LAMs which have been employed for research and climate re-analysis include MM5, RAMS and WRF. In recent years, WRF [1] has become the most widely used model, due to its versatility as an open-source package supported by a large user community and the fact that it can be operated efficiently on parallel computing platforms, making it suitable for use in a wide range of applications at high horizontal resolutions (down to 1 km or less).
The use of high-resolution regional weather models (LAMs) nested within coarser, global, weather models is currently gaining ground in the domain of satellite imagery, as the precise knowledge of meteorological parameters along the lower atmosphere is extremely useful for the correction of remote sensing data, and in particular Synthetic Aperture Radar (SAR) interferograms [2][3][4][5][6][7][8]. Tropospheric stratification and short wavelength spatial turbulences produce a phase delay, and consequently an additive noise to the low amplitude ground deformations calculated by the multitemporal InSAR methodology. The removal of the atmospheric phase screen (APS) is a challenging task (as opposed to GNSS, for example, which uses signals in the same microwave bandwidth), due to several reasons, including the small number of satellites, the single pass geometry and the gridded nature of the acquired data. Several methods have so far been pursued to mitigate the tropospheric delay in InSAR data, including local atmospheric data collection [9], GNSS zenithal delay estimations [10][11][12], satellite multispectral imagery analysis [13][14][15], empirical phase/topography estimations [8,[16][17][18], and zenithal delay estimations from Global Atmospheric Models (GAMs) [19][20][21]. These methods have their limitations, as they rely either on local data assimilation, which is rarely available at the desired spatial resolution, or on empirical estimations which are difficult in situations where deformation and topography are correlated.
The advantage of using LAMs over other techniques is the ability to produce detailed delay maps at high spatial resolution (down to 1 km), by estimating atmospheric parameters, such as water vapour and temperature, in the entire vertical column which cannot be easily estimated otherwise. Foster et al. [2,3], employed the MM5 regional model at high horizontal resolution (3 km) to obtain tropospheric delay fields over the Island of Hawaii and Mount St. Helens in the U.S. with mixed success, as the model configuration failed to accurately predict tropospheric delays at shorter wavelengths (under 8 km) in most cases. Wadge et al. [5], performed tropospheric delay corrections, using the Unified Regional Mesoscale Model at high-resolution over Mount Etna in Sicily. The study concluded that LAMs show a promising potential in calculating the path delay due to tropospheric water vapor in regions of high relief and high water vapour variance such as Mount Etna, and that the model was able, under certain conditions, to simulate local modifications of water vapour content by mechanisms such as land-sea breezes. More recently, Bekaert et al. [8] produced tropospheric delay fields with the WRF regional model Remote Sens. 2021, 13, 1898 3 of 20 over three test-regions with complex topography. The model is nested at 7 km horizontal resolution and is initiated with data from the Global Forecast System (GFS). Results are compared with tropospheric delays obtained with other state-of-the-art methods, including MERIS and MODIS spectroscopy, ERA-Interim, and both the conventional linear and power-law empirical methods. The statistical analysis demonstrated that spectrometers provided the largest RMSE reduction, but only under daylight and cloud-free conditions. Phase-based methods (linear and power-law) outperformed the weather models in regions where tropospheric delays were correlated with topography, but in regions where this was less apparent, due to atmospheric turbulence and dynamic local weather, weather models offered better performance. On the whole, recent studies which employ LAMs for the removal of tropospheric artifacts from InSAR data demonstrate the ability of weather models to calculate detailed tropospheric delay fields under any atmospheric conditions and at the exact times of SAR acquisitions, thus offering a reliable tool for tropospheric corrections. Indeed, it is evident that in cases of atmospheric turbulence and dynamic local weather, weather models can offer better performance [8,22]. However, the generic configuration and parameterization of the LAMs used in these studies have prohibited, so far, the full exploitation of the method.
Our work aims to maximise the potential of using LAMs for the calculation and removal of the tropospheric component from SAR interferograms by evaluating a novel methodology which integrates the latest advances in the fields of remote sensing meteorology (GNSS and InSAR) and high-resolution Numerical Weather Forecasting (WRF). Our research focuses on the interaction of the three aforementioned techniques in the area of the western Gulf of Corinth in Greece and aims to optimise their synergic output by identifying strengths, weaknesses and uncertainties with respect to the measurement of zenithal tropospheric delays. We investigate the extent to which a high-resolution WRF 1 km re-analysis can produce detailed tropospheric delay maps of the required accuracy, by coupling its output, in terms of Zenith Total Delay (ZTD), with the vertical delay component in GNSS measurements. One of the main limitations in the estimation of precise ZTD fields is the presence of highly variable water vapour signals, both in space and in time, which are exhibited in the differential atmosphere as densely distributed short-wavelength phase gradients. It is expected that the high horizontal resolution and dense vertical layering of the model will be capable of capturing near-surface atmospheric processes such as sea breezes, orographic flows, turbulent boundary layer interactions, etc., which influence the distribution of water vapour in settings of complex topography, thus overcoming this important limitation.
The model is initially operated with varying parameterization in order to demonstrate the best possible configuration, with GNSS measurements providing a benchmark of real atmospheric conditions. The two datasets (predicted and observed) are compared and statistically evaluated for a period of one year, in order to investigate the extent to which meteorological parameters that affect ZTD, can be simulated accurately by the model under different weather conditions. In the next phase, we compare twenty Sentinel-1A interferograms with differential delay maps at the LOS produced by WRF re-analysis. Results of the second phase of the experiment are presented in a separate paper.

Description of the Study Area and Experimental Setup
The experiment, with the code name PaTrop (Patras-Troposphere), was implemented for providing the data needed for this study. The PaTrop test site covers an area of approximately 150 × 90 km in the region of the Western Gulf of Corinth (GoC). It is identified as a site of intense geophysical activity and forms one of the most active intracontinental rifts in the world. Geodetic studies conducted over the past 20 years based on GNSS and InSAR observations have revealed North-South extension rates up to 1.5 cm yr −1 [23,24], one of the highest worldwide. In this context, the existing network of permanent GNSS stations used by the Corinth Rift Laboratory (CRL) to continuously Remote Sens. 2021, 13,1898 4 of 20 monitor surface displacements in the area could be exploited for the needs of our study. A network of nineteen permanent Topcon GB1000 and Topcon Net G3A GNSS receivers fitted with Topcon PG-A1 antennas provided the GNSS data and the subsequent in-situ zenithal tropospheric delay measurements, as shown in Figure 1b. Ten of those receivers were installed during the PaTrop campaign in order to expand the existing CRL network. The stations' locations were selected to cover the whole geographic extent of the study area, while capturing a variety of different topographical and meteorological conditions (i.e., coastal, inland, or mountainous terrain), at elevations between 0 and 1020 m above sea level (ASL). This was deemed necessary, in order to account mainly for water vapour variations resulting from orographic, coastal, and frontal gradients that could be present.  [23,24], one of the highest worldwide. In this context, the existing network of permanent GNSS stations used by the Corinth Rift Laboratory (CRL) to continuously monitor surface displacements in the area could be exploited for the needs of our study. A network of nineteen permanent Topcon GB1000 and Topcon Net G3A GNSS receivers fitted with Topcon PG-A1 antennas provided the GNSS data and the subsequent in-situ zenithal tropospheric delay measurements, as shown in Figure 1b. Ten of those receivers were installed during the PaTrop campaign in order to expand the existing CRL network. The stations' locations were selected to cover the whole geographic extent of the study area, while capturing a variety of different topographical and meteorological conditions (i.e., coastal, inland, or mountainous terrain), at elevations between 0 and 1020 m above sea level (ASL). This was deemed necessary, in order to account mainly for water vapour variations resulting from orographic, coastal, and frontal gradients that could be present. Concurrently with the field campaign, the modelling setup and configuration was also implemented. The primary objective of PaTrop is to couple the zenithal tropospheric delay (ZTD) derived from GNSS data with the ZTDs derived from the output of a high-resolution meteorological model (WRF), in order to investigate the model's capability to reproduce the tropospheric conditions that contribute to the noise signal (in particular the highly variable water vapour distribution), and provide a benchmark of real observational data for validating the model output. The Weather Research and Forecasting (WRF) model [1] is a widely used open-source weather forecasting and re-analysis model that can be configured by the user according to the specific needs of each study. WRF has a proven record of producing high-resolution meteorological simulations down to a scale of hundreds of meters. It can be installed and operated in parallel mode using multi-processor computing resources, and therefore computational time can be greatly reduced for high-resolution simulations covering large geographical areas, such as PaTrop. Keeping the spatial resolution of the inner domain at 1 km, a series of re-analysis runs were produced to demonstrate the best possible configuration for our study, which is an approach supported theoretically and practically to tackle uncertainties in high-resolution modelling [25,26]. The parametric analysis was performed for a Concurrently with the field campaign, the modelling setup and configuration was also implemented. The primary objective of PaTrop is to couple the zenithal tropospheric delay (ZTD) derived from GNSS data with the ZTDs derived from the output of a highresolution meteorological model (WRF), in order to investigate the model's capability to reproduce the tropospheric conditions that contribute to the noise signal (in particular the highly variable water vapour distribution), and provide a benchmark of real observational data for validating the model output. The Weather Research and Forecasting (WRF) model [1] is a widely used open-source weather forecasting and re-analysis model that can be configured by the user according to the specific needs of each study. WRF has a proven record of producing high-resolution meteorological simulations down to a scale of hundreds of meters. It can be installed and operated in parallel mode using multi-processor computing resources, and therefore computational time can be greatly reduced for highresolution simulations covering large geographical areas, such as PaTrop. Keeping the spatial resolution of the inner domain at 1 km, a series of re-analysis runs were produced to demonstrate the best possible configuration for our study, which is an approach supported theoretically and practically to tackle uncertainties in high-resolution modelling [25,26]. The parametric analysis was performed for a two-week period (17)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29) June 2016), during which the output of five different model configurations was tested against GNSS tropospheric measurements from the network of permanent stations in the study area. The optimum model configuration which resulted from the analysis was subsequently employed to provide the data for the main part of the PaTrop experiment. In order to establish the correlation between the observed and predicted time series, the following metrics are used, which give us a quantitative indication of the prediction skill of different model schemes with respect to the GNSS network dataset: where f i denotes the model value, o i the observational value, N is the number of pairs in the examined time series, σ f is the standard deviation of the model values and σ o the standard deviation of the observations.

Data Processing
Processing of the GNSS data started after the end of the experimental campaign and included calculations of zenith tropospheric delays (ZTDs) every 30 min using the JPL NASA Precise Point Positioning (PPP) GIPSY-OASIS 6.4 software. High-precision International GNSS Service (IGS) final orbit data were used, while the quality of the sites ensured that multipath effects would not bias the estimated ZTDs. Static tropospheric processing with no mapping function was used, and a priori ZHD was estimated from an elevation-dependent function. It was intended to use a simple set of parameters in order to compare the WRF model output with a geometrical GNSS calculation which does not assimilate data from external sources (i.e., weather models). The exact settings of the processing protocol used are listed in Table 1. With regards to the processing of WRF tropospheric data, values of surface pressure (P s ), as well as air temperature and water vapour pressure (T, p v ) in the vertical column (for each of the 45 vertical pressure levels) were derived every 30 min, at hh:00 and hh:30 h, at the nearest 1 km grid point from each GNSS station, to coincide with the observational time series. An automatic routine calculates the "dry" and "wet" delay terms separately, from the three parameters (P s , T, p v ), together with the point elevation and layer heights, finally adding up the two to calculate the ZTD value [27][28][29]: where P s is the total pressure (mbar) at the Earth's surface, and: accounts for the variation in gravitational acceleration with latitude λ and the height H of the surface above the ellipsoid (in km). And: where p v is the water vapour pressure (mbar), and T the air temperature (K), integrated along the zenith path z. WRF derived ZTDs were calculated at the exact elevation of the GNSS receiver, by vertically interpolating these parameters, thus minimising errors due to vertical height differences between the two datasets. Figure 2 illustrates graphically the GNSS-WRF geometry used in the calculation of the respective ZTD values.
where is the total pressure (mbar) at the Earth's surface, and: accounts for the variation in gravitational acceleration with latitude λ and the height H of the surface above the ellipsoid (in km). And: where is the water vapour pressure (mbar), and the air temperature (K), integrated along the zenith path z.
WRF derived ZTDs were calculated at the exact elevation of the GNSS receiver, by vertically interpolating these parameters, thus minimising errors due to vertical height differences between the two datasets. Figure 2 illustrates graphically the GNSS-WRF geometry used in the calculation of the respective ZTD values.

Model Configuration and Parameterization of Physical Components
For the high-resolution dynamical downscaling simulation performed with WRF v 3.7.1 over the PaTrop area of the western Gulf of Corinth, four nested domains were used (d01-d04), with a horizontal resolution of 27, 9, 3 and 1 km, respectively, as shown in Figure 1a; two-way nested, i.e., feedback from nest to its parent domain. The vertical layer distribution consists of 45 sigma levels up to a height of about 20 km (0.1 hPa), with bottom layers being more densely populated. Boundary conditions for the model initialization were taken from the ERA-Interim global climate re-analysis database, with a 75 km horizontal resolution, 35 vertical layers and 6 h temporal resolution. The model was initiated every day from the ERA-Interim input data at 18:00 local time, producing 30 h simulations with the first 6 h being spin-up time. Model output was recorded every 30 min, from which Zenith Hydrostatic Delays (ZHD) and Zenith Wet Delays (ZWD) were calculated as previously described. Land-use categories were taken from USGS 24-category data, which are available for different horizontal resolutions (10′, 5′, 2′, 30″). The initial land topography dataset used was the Global 30 Arc-Second Elevation Model

Model Configuration and Parameterization of Physical Components
For the high-resolution dynamical downscaling simulation performed with WRF v 3.7.1 over the PaTrop area of the western Gulf of Corinth, four nested domains were used (d01-d04), with a horizontal resolution of 27, 9, 3 and 1 km, respectively, as shown in Figure 1a; two-way nested, i.e., feedback from nest to its parent domain. The vertical layer distribution consists of 45 sigma levels up to a height of about 20 km (0.1 hPa), with bottom layers being more densely populated. Boundary conditions for the model initialization were taken from the ERA-Interim global climate re-analysis database, with a 75 km horizontal resolution, 35 vertical layers and 6 h temporal resolution. The model was initiated every day from the ERA-Interim input data at 18:00 local time, producing 30 h simulations with the first 6 h being spin-up time. Model output was recorded every 30 min, from which Zenith Hydrostatic Delays (ZHD) and Zenith Wet Delays (ZWD) were calculated as previously described. Land-use categories were taken from USGS 24-category data, which are available for different horizontal resolutions (10 , 5 , 2 , 30"). The initial land topography dataset used was the Global 30 Arc-Second Elevation Model (GTOPO30) provided by United States Geological Survey (USGS), with a 30" resolution for the smaller domain (d04), and coarser resolutions (10 , 5 , 2 ) for domains d01-d03, Remote Sens. 2021, 13, 1898 7 of 20 respectively. However, in order to test the impact of a more detailed topography on the re-analysis output, a high-resolution terrestrial dataset of d04 was later introduced (ASTER 1" global GDEM v2), with a horizontal grid of 30 m.
For the model physical and dynamical parameterization, five different schemes were tested, in order to evaluate each scheme for its forecasting skill. There have been numerous studies validating the output of different model configurations with observations under specific conditions [25,[30][31][32], showing that globally there is no optimal scheme, but rather different schemes produce better results with respect to application, domain, season, variable, etc. Therefore, a model parameterization test was performed for a two-week period (17)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29) June 2016), during which the output of the five different model configurations was tested against GNSS tropospheric measurements from 16 permanent stations in the study area (d04).
The schemes were selected based on existing studies where similar high-resolution WRF simulations were used. All five schemes use the same parameterization for radiation physics (shortwave and longwave) and cumulus convection. In the first three schemes (MOD1, MOD2 and MOD3), cumulus convection is modelled in the 27 km domain according to the Kain-Fritsch scheme [33,34]. The cumulus scheme is not activated for the 9, 3 and 1 km domains, because at higher resolution the model can theoretically resolve convection explicitly [34]. In schemes MOD4 and MOD5, cumulus convection was turned on for d02 (9 km) in order to test the effect of cumulus parameterization in a smaller domain. Convection plays an important role for cloud formation and is controlled by micro-scale processes such as mixtures of updrafts and downdrafts. These simulated convective features are less distinguishable as model resolution becomes coarser; therefore, parameterization becomes necessary, although computationally demanding. Furthermore, in locations such as the Gulf of Corinth, where cloud formation is strongly influenced by the intense topography (land-sea contrasts and mountainous features), it is expected that cumulus parameterization in the 9 km domain will better represent the effects of sub-grid scale processes on the grid variables, particularly in the case of squall line formation, thunderstorms and other strong convection events [35].
Longwave radiation was simulated by the RRTM scheme [36] with the default diffusion scheme selected. The shortwave radiation was simulated by the Dudhia scheme [37]. Both are typical schemes for high-resolution WRF simulations. With respect to microphysics, land surface and planetary boundary layer options, the model was originally configured with the basic options for a less computationally demanding ERA-Interim dynamical downscaling at 1 km horizontal resolution. In the second scheme (MOD2), the microphysics scheme was changed to Morrison double-moment [38], in order to investigate the effect of a more complex model which uses double-moment ice, snow, rain and graupel for cloud-resolving simulations [25,39]. In the third scheme (MOD3), the same configuration as MOD2 was used, with a change in land surface parameters, in which the Pleim-Xiu land surface model and Pleim-Xiu surface layer physics was used [40,41]. In the fourth scheme (MOD4), the same configuration as MOD3 was used, this time with the cumulus scheme activated in the second domain (9 km), as explained above. Finally, the fifth scheme (MOD5) used the parameterization of Zhang et al. [42], as it refers to a study with similar dynamical downscaling characteristics (input dataset and horizontal resolution), performed over a complex terrain with land-sea contrasts (Hong Kong island). The parameters changed compared with the previous schemes were microphysics (SBU-YLin model was used, a sophisticated scheme that has ice, snow and graupel processes, suitable for real-data high-resolution simulations) [43], PBL scheme (YSU used in conjunction with SBU-YLin), and surface layer scheme (MM5 similarity). The NOAH model was used for land surface. Again, cumulus convection is activated in the second domain (9 km). Table 2 lists the parameterization used in each scheme.

Parametric Analysis and Evaluation of WRF Schemes with GNSS Data
The parametric analysis was performed for a two-week period (17)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29) June 2016), with the aim to test the forecasting skill of different WRF parameterization schemes. Tropospheric data (i.e., ZTD values) from the CRL GNSS network provided the benchmark for the analysis, for comparison with ZTDs derived from WRF at the nearest grid point. The selection of the test period for the parametric analysis was based on weather conditions (i.e., high temperatures and the occurrence of a convective storm event during the last days of June), which theoretically would produce a high degree of ZTD variability.

Parametric Analysis and Evaluation of WRF Schemes with GNSS Data
The parametric analysis was performed for a two-week period (17)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29) June 2016), with the aim to test the forecasting skill of different WRF parameterization schemes. Tropospheric data (i.e., ZTD values) from the CRL GNSS network provided the benchmark for the analysis, for comparison with ZTDs derived from WRF at the nearest grid point. The selection of the test period for the parametric analysis was based on weather conditions (i.e., high temperatures and the occurrence of a convective storm event during the last days of June), which theoretically would produce a high degree of ZTD variabil-   Results at most stations exhibit a strong correlation between WRF derived ZTDs and GNSS values, with the exception of the last two days of the experiment (28-29/6) where the model shows significant deviation from the observed data. All schemes seem to follow the overall trend (including the storm event of 25-26 June). The Correlation Coefficient R at the 16 locations, for the test period, ranges from 0.57 at LIDO to 0.85 at PAT0, both with the MOD5 scheme (Table 3). Overall, MOD3, MOD4 and MOD5 exhibit the strongest correlation, with MOD5 having the highest average R score (0.74).  Results at most stations exhibit a strong correlation between WRF derived ZTDs and GNSS values, with the exception of the last two days of the experiment (28-29/6) where the model shows significant deviation from the observed data. All schemes seem to follow the overall trend (including the storm event of 25-26 June). The Correlation Coefficient R at the 16 locations, for the test period, ranges from 0.57 at LIDO to 0.85 at PAT0, both with the MOD5 scheme (Table 3). Overall, MOD3, MOD4 and MOD5 exhibit the strongest correlation, with MOD5 having the highest average R score (0.74). The model in general seems to slightly underestimate the observed data, especially for high ZTD values. Mean bias (MB) values indicate that, on average, this offset ranges from −14.9 mm (MOD1) to −11.6 mm (MOD5). Mean absolute bias (MAB) is a measure of the average absolute error between the two time series. Mean absolute bias values at the 16 locations range from about 15 mm at KALA to 26 mm at ANOC. Schemes with more sophisticated physical parameterization, better suited for high-resolution re-analysis (MOD4 and MOD5) exhibit the lowest overall MAB values (19.5 mm). Similarly, root mean square error (RMSE) values range from 21 mm at KALA to 31 mm at ANOC. Figure 5 illustrates the RMSE spread between the 16 points, for each model configuration. It is shown that schemes with a more complex physical parameterization (MOD4 and MOD5) exhibit lower RMSE values, as well as narrower spread than initial WRF configurations.
Results also indicate that the RMSE distribution is more homogeneous among coastal and inland stations (blue and green colours), than stations in mountainous locations (orange and red). This finding could be related to the specific meteorological conditions which characterize regions with intense topography, as well as the fact that input data (e.g., measurements from surface stations) injected into the model as initial conditions, may be sparse in certain remote locations, leading to a poorer model prediction skill in these areas. Results also indicate that the RMSE distribution is more homogeneous among coastal and inland stations (blue and green colours), than stations in mountainous locations (orange and red). This finding could be related to the specific meteorological conditions which characterize regions with intense topography, as well as the fact that input data (e.g., measurements from surface stations) injected into the model as initial conditions, may be sparse in certain remote locations, leading to a poorer model prediction skill in these areas. A graphical summary of validation metrics for MOD5 is found in Figure 6. We observe that although R is decreasing with increasing station elevation (as previously discussed), MAB and RMSE are not significantly increasing, which is a result of lower absolute ZTD values at higher altitudes where the tropospheric layer is thinner. A graphical summary of validation metrics for MOD5 is found in Figure 6. We observe that although R is decreasing with increasing station elevation (as previously discussed), MAB and RMSE are not significantly increasing, which is a result of lower absolute ZTD values at higher altitudes where the tropospheric layer is thinner. Results also indicate that the RMSE distribution is more homogeneous among coastal and inland stations (blue and green colours), than stations in mountainous locations (orange and red). This finding could be related to the specific meteorological conditions which characterize regions with intense topography, as well as the fact that input data (e.g., measurements from surface stations) injected into the model as initial conditions, may be sparse in certain remote locations, leading to a poorer model prediction skill in these areas. A graphical summary of validation metrics for MOD5 is found in Figure 6. We observe that although R is decreasing with increasing station elevation (as previously discussed), MAB and RMSE are not significantly increasing, which is a result of lower absolute ZTD values at higher altitudes where the tropospheric layer is thinner. Under turbulent and humid atmospheric conditions (e.g., afternoon of 25 June), different schemes behave in different manners. Two separate groups can be distinguished (MOD1, MOD2 and MOD3 vs. MOD4 and MOD5), with the second group exhibiting a better prediction skill, possibly as a result of the Kain-Fritsch cumulus convection scheme being turned on at the 9 km domain. A significant deviation event can also be distinguished during the second half of 28 June, where the model produces a sudden "drop" in all stations (of the order of 60-120 mm) in relation to the observed ZTDs.

Validation of WRF Derived Tropospheric Delay Maps with GNSS ZTD Measurements for the PaTrop Period (January-December 2016)
Following the initial configuration of the WRF model and parameterization based on the short-scale analysis, as described in the previous section, the main part of the PaTrop experiment extends into a whole year of validation of model re-analysis output with the use of observational tropospheric data from the CRL GNSS network in the Western Gulf of Corinth. The WRF scheme which provided the best simulation results based on the parametric test was selected (MOD5), having a more complex physical parameterization, and hence being better suited for high-resolution re-analysis simulations. This includes: the SBU-YLin microphysics model, with a more sophisticated scheme for ice, snow, rain and graupel processes in the lower troposphere; the NOAH land surface model; the MM5 similarity scheme for surface layer physics; and the YSU planetary boundary layer model. In addition, MOD5 uses a cumulus convection scheme in both the 27 km and 9 km domains, thus simulating processes such as convective fluxes and the associated evaporation or condensation of water more coherently over a complex terrain with land-sea contrasts. The MOD5 output, represented as Zenith Total Delay (ZTD) values calculated from specific atmospheric parameters (surface pressure, air temperature and water vapour profiles) over the entire 1 × 1 km grid is validated against a dataset of GNSS derived ZTD values, providing point measurements at the 19 points where the stations are located.
Macroscopically, the two ZTD datasets are closely correlated in all 19 locations (Figure 7), with values peaking during the warm months (July-September) and subduing during the cold period (December-March), as expected (ZTD is proportional to surface pressure and air temperature which both increase during summer). It is also visible from the oscillating pattern of the time series that in absolute terms the tropospheric delay signal shows a higher degree of variability during the second half of the year than during the first half, as a result of a combination of more intense temperature and water vapour fluctuations. Bias plots of predicted (WRF) minus observed (GNSS) ZTDs (Figure 8) indicate a slightly stronger bias during the warm period. For a more detailed analysis of the correlation between the two datasets, the same statistical indices are used as in the physical parameterization study (R, MB, MAB, RMSE), which give us a quantitative indication of the accuracy and variability of the model prediction with respect to the dataset of the GNSS monitoring network. A summary of the regression statistics for the whole period of study is presented in        , and indicate that the model tends to slightly underestimate the tropospheric ZTD as compared to the GNSS derived values. This finding is in-line with similar WRF evaluation studies [27,33] reporting consistently negative differences in relative humidity (a primary physical parameter in calculating the ZTD) with respect to ground observations in highresolution WRF re-analysis scenarios, which are attributed to differences in vertical mixing strength and entrainment. • Mean absolute bias (MAB) and root mean square error (RMSE) are both a measure of the absolute error between the two time series and are particularly useful, as the correction of the tropospheric component in InSAR interferograms is dependent on the model's capability to produce high-resolution differential meteograms of tropospheric delay with the minimum absolute error (of the order of magnitude of one interferometric phase cycle π). Mean absolute bias values at the 19 locations range from 14.9 mm (KALA station) to 29.0 mm (PSAT station), with RMSE values covering a similar range.
A further look into the model validation metrics for the entire PaTrop period reveals some additional trends: • As shown in Figure 9a, the RMSE seems to be independent of the horizontal distance s between the GNSS station and the nearest WRF grid point where the calculation of the predicted ZTD is performed. Therefore, we can conclude that the horizontal resolution of 1 km used for the WRF simulation is adequate.

•
With respect to station elevation h, a small reduction of MB in terms of its absolute value is evident with increasing h, as expected, due to smaller ZTD values (Figure 9b).
Out of the 19 stations, the three highest mean negative biases are in XILI, PSAT and PAT0 (elevations ASL 4, 19 and 91 m), while the two lowest are in LIDO and KALA (550 and 716 m). The graphical summary of validation metrics for the entire period ( Figure 10) also reveals that while R is fairly constant with increasing station elevation, MAB and RMSE exhibit a reduction.

Seasonal Characteristics of WRF vs. GNSS ZTD and Evaluation of Model Performance
Beyond the annual time series analysis and evaluation, it is useful to perform a more in-depth investigation of seasonal trends, which can provide an important insight into the model's forecast skill under different meteorological conditions. The year is therefore divided into four seasons (S1-S4) based on their distinct climatological characteristics as follows: Winter S1: January-March. Spring S2: April-June. Summer S3: July-September. Autumn S4: October-December. Table 5 lists the corresponding model validation metrics per station and for each season. In general, model performance is better during autumn (S4), as correlation with the observed ZTD time series is high at all stations (R = 0.93 with range 0.91-0.94) and MAB and RMSE are lower (average MAB = 20.5; RMSE = 24.6). In spring (S2), WRF reproduces 88% of the ZTD variability (R = 0.88 with range 0.85-0.89) and exhibits a slightly higher bias than autumn (average MAB = 21.1; RMSE = 25.7). In winter (S1), the model reproduces 85% of the ZTD variability; however, with a wider correlation range between stations (R = 0.85 with range 0.75-0.88), and bias is similar to S2 (average MAB = 21.2;

Seasonal Characteristics of WRF vs. GNSS ZTD and Evaluation of Model Performance
Beyond the annual time series analysis and evaluation, it is useful to perform a more in-depth investigation of seasonal trends, which can provide an important insight into the model's forecast skill under different meteorological conditions. The year is therefore divided into four seasons (S1-S4) based on their distinct climatological characteristics as follows: Winter S1: January-March. Spring S2: April-June. Summer S3: July-September. Autumn S4: October-December. Table 5 lists the corresponding model validation metrics per station and for each season. In general, model performance is better during autumn (S4), as correlation with the observed ZTD time series is high at all stations (R = 0.93 with range 0.91-0.94) and MAB and RMSE are lower (average MAB = 20.5; RMSE = 24.6). In spring (S2), WRF reproduces 88% of the ZTD variability (R = 0.88 with range 0.85-0.89) and exhibits a slightly higher bias than autumn (average MAB = 21.1; RMSE = 25.7). In winter (S1), the model reproduces 85% of the ZTD variability; however, with a wider correlation range between stations (R = 0.85 with range 0.75-0.88), and bias is similar to S2 (average MAB = 21.2;

Seasonal Characteristics of WRF vs. GNSS ZTD and Evaluation of Model Performance
Beyond the annual time series analysis and evaluation, it is useful to perform a more in-depth investigation of seasonal trends, which can provide an important insight into the model's forecast skill under different meteorological conditions. The year is therefore divided into four seasons (S1-S4) based on their distinct climatological characteristics as follows: Winter S1: January-March. Spring S2: April-June. Summer S3: July-September. Autumn S4: October-December. Table 5 lists the corresponding model validation metrics per station and for each season. In general, model performance is better during autumn (S4), as correlation with the observed ZTD time series is high at all stations (R = 0.93 with range 0.91-0.94) and MAB and RMSE are lower (average MAB = 20.5; RMSE = 24.6). In spring (S2), WRF reproduces 88% of the ZTD variability (R = 0.88 with range 0.85-0.89) and exhibits a slightly higher bias than autumn (average MAB = 21.1; RMSE = 25.7). In winter (S1), the model reproduces 85% of the ZTD variability; however, with a wider correlation range between stations (R = 0.85 with range 0.75-0.88), and bias is similar to S2 (average MAB = 21.2; RMSE = 25.2). Model forecasting skill deteriorates during summer (S3), where correlation is weaker (average R = 0.83) and bias indicators are higher (MAB = 22.8, RMSE = 28.0). These results are in-line with seasonal climate characteristics in the region of Western Greece, as stable weather conditions and frontal precipitation patterns which prevail during autumn are well simulated by the model, whereas the combination of high temperatures and turbulent conditions during some summer days, resulting in convective storms are more difficult to predict. A graphical illustration of the RMSE variability for each season, at the location of each station, is seen in Figure 11.
When we look at the monthly variability of the mean absolute bias (Figure 12), a distinct "peak" in February and March is evident in most coastal and inland stations, possibly due to the winter storms which are common during this period. A second "peak" during the summer period (June-September) is explained by a combination of high temperatures and the occurrence of intense convective storm events, as discussed earlier. Stations which are located inland exhibit a more uniform monthly variability of the MAB (co-variance is high), while coastal stations exhibit a more diverse variability, particularly during the "hot" season S3. In stations located upland, two separate groups are distinct: the first one including the Peloponnese stations (south) plus LIDO exhibit a smaller degree of seasonal MAB variability, while ANOC and MESA (both located in the northern Aitoloakarnania region) exhibit higher seasonal variability (distinct "peaks" during S1 and S3 and higher MAB). We can further investigate local inhomogeneities between stations by calculating the correlation coefficient R of their difference, i.e., between the GNSS ∆ZTD and WRF ∆ZTD of stations MESA and KALA, for seasons S1-S4. Thus, the annual and semi-annual component of ZTD variability is removed from the time series and the ability of the model to resolve small-scale inhomogeneities is demonstrated. Results indicate that model performance is better during the second half of the year, as WRF reproduces 54% of the observed ∆ZTD variability between the two stations in summer (S3) and 52% in autumn (S4) (R is 0.54 and 0.52 respectively). The correlation is slightly weaker during the first half, with R = 0.48 in winter (S1) and R = 0.49 in spring (S2). peratures and turbulent conditions during some summer days, resulting in convective storms are more difficult to predict. A graphical illustration of the RMSE variability for each season, at the location of each station, is seen in Figure 11.  When we look at the monthly variability of the mean absolute bias (Figure 12), a distinct "peak" in February and March is evident in most coastal and inland stations, possibly due to the winter storms which are common during this period. A second "peak" during the summer period (June-September) is explained by a combination of high temperatures and the occurrence of intense convective storm events, as discussed earlier. Stations which are located inland exhibit a more uniform monthly variability of the MAB (co-variance is high), while coastal stations exhibit a more diverse variability, particularly during the "hot" season S3. In stations located upland, two separate groups are distinct: the first one including the Peloponnese stations (south) plus LIDO exhibit a When we look at the monthly variability of the mean absolute bias (Figure 12), a distinct "peak" in February and March is evident in most coastal and inland stations, possibly due to the winter storms which are common during this period. A second "peak" during the summer period (June-September) is explained by a combination of high temperatures and the occurrence of intense convective storm events, as discussed earlier. Stations which are located inland exhibit a more uniform monthly variability of the MAB (co-variance is high), while coastal stations exhibit a more diverse variability, particularly during the "hot" season S3. In stations located upland, two separate groups are distinct: the first one including the Peloponnese stations (south) plus LIDO exhibit a smaller degree of seasonal MAB variability, while ANOC and MESA (both located in the northern Aitoloakarnania region) exhibit higher seasonal variability (distinct "peaks" during S1 and S3 and higher MAB). We can further investigate local inhomogeneities between stations by calculating the correlation coefficient R of their difference, i.e., between the GNSS ΔZTD and WRF ΔZTD of stations MESA and KALA, for seasons S1-S4. Thus, the annual and semi-annual component of ZTD variability is removed from the time series and the ability of the model to resolve small-scale inhomogeneities is demonstrated. Results indicate that model performance is better during the second half of the year, as WRF reproduces 54% of the observed ΔZTD variability between the two stations in summer (S3) and 52% in autumn (S4) (R is 0.54 and 0.52 respectively). The correlation is slightly weaker during the first half, with R = 0.48 in winter (S1) and R = 0.49 in spring (S2).
The overall model performance with respect to estimating the ZTD parameter for the successful tropospheric correction of InSAR interferograms is evaluated by setting the acceptable bias range as the amplitude of one Sentinel-1 C-band cycle phase π (equal to λ/2, where λ is the SAR signal wavelength), i.e., ±23 mm of tropospheric delay, when projected to the zenithal distance (considering a mean incidence angle of 35 • ). This allows us to determine the percentage of σ lying within the accepted range per season (Table 6), which in theory represents the probability that the model successfully detects and removes the tropospheric delay.

Discussion
Results of the physical parameterization analysis indicate that WRF configurations MOD4 and MOD5 overall exhibit a better prediction skill during the test period, with small differences between them. Turning on the Kain-Fritsch cumulus convection scheme on the 9 km domain improves model output, particularly during intense frontal events, as demonstrated by validation metrics of MOD4 in comparison with MOD3 (all physical parameters are the same except the cumulus scheme). Moreover, the use of more complex microphysics schemes (such as Morrison and SBU-YLin), surface layer (Pleim-Xiu and MM5) and planetary boundary layer (ACM2 and YSU) models, although computationally demanding, further enhances the model prediction skill. Indeed, model setup and physical parameterization can affect the intrinsic water balance [44], resulting in a more accurate representation of precipitation and relative humidity fields [25,45]. Based on the results of the parametric analysis, MOD5 is selected as the optimum configuration for producing tropospheric delay maps for the entire PaTrop period (January-December 2016), having a slightly higher R, lower MB and better clustering of RMSE than MOD4.
On an annual basis, the statistical analysis of the results demonstrated that the correlation between predicted and observed ZTDs at the 19 stations is good throughout the year (correlation co-efficient R ranges from 0.91 to 0.93), with mean bias (MB) ranging from −11.1 mm (KALA station) to −28.2 mm (PSAT station), indicating that the model tends to slightly underestimate the tropospheric ZTD as compared to the GNSS derived values. Results are in-line with previous WRF high-resolution studies [25,46,47], which report consistent underestimation of RH at the ground level (ranging from 9 to 15%). With respect to the seasonal component, model performance is better during the autumn period (October-December), followed by the spring period (April-June). Correlation with the observed ZTD time series is high at all stations (correlation co-efficient R is 0.93 and 0.88, respectively) and MAB and RMSE values are low. On the contrary, model forecasting skill seems to deteriorate during summer (July-September), where correlation is weaker (R = 0.83) and MAB and RMSE values are higher (average MAB = 22.8, RMSE = 28.0). Again, results re-confirm research findings from similar comparisons of GNSS tropospheric products and LAMs in Europe [48][49][50], reporting a pronounced seasonal cycle of GNSS-NWP model bias and standard deviation with the largest errors in summer and smallest in winter. Setting the acceptable bias range at ±23 mm (equal to the amplitude of one Sentinel−1 C-band phase cycle when projected to the zenithal distance), it is demonstrated that the model produces satisfactory results, with a percentage of ZTD values within the bias margin ranging from 57% in summer to 63% in autumn.

Conclusions
The experimental approach followed aims to maximise the potential of using LAMs for the calculation and removal of the tropospheric noise from InSAR data. Five different WRF parameterization schemes were tested, ranging from schemes with relatively simple physical and dynamical parameterization to more complex schemes which require longer computational times. A parametric analysis was performed for a two-week period (17)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29) June 2016), during which the output of different model configurations, in terms of ZTD, was compared with GNSS tropospheric measurements from 16 permanent stations in the study area (d04). Results were tested for their statistical significance and demonstrated that the optimum WRF configuration to be used for the entire period of the PaTrop experiment was MOD5. Consequently, daily runs of the optimal scheme were performed for a one-year period (January-December 2016), and model output was again validated with the use of the observational GNSS dataset from the CRL network in the Western Gulf of Corinth.
Overall, the current study confirms the high potential and effectiveness of using high-resolution atmospheric modelling for calculating the ZTD component, and thus correcting atmospheric phase gradients in interferograms. The innovative elements that our methodology brings in comparison with previous studies are: (a) High-resolution re-analysis with WRF: our model is downscaled at 1 km horizontal resolution and therefore offers a detailed reconstruction of the 3-D tropospheric field over the study area. The model was locally configured and parameterized in the area of the western Gulf of Corinth, and numerous complex schemes were tested in order to demonstrate the optimal configuration at the specific location; (b) Validation of the model: WRF output was validated with the use of GNSS tropospheric data retrieved from a dense array of stations covering the study area. Model validation with vertical column data (GNSS zenithal delays) instead of ground measurements offers the capability of evaluating the model's forecasting skill over the entire 3-D field, thus enabling fine-tuning of its physical parameterization with the use of a high-accuracy representative observational dataset. This is particularly useful when it comes to estimating the highly variable water vapour signals which are exhibited in the differential atmosphere as densely distributed short-wavelength phase gradients.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to extensive size.