1. Introduction
Water vapor is a key greenhouse gas and an indispensable component of the water cycle. Although it accounts for only 0.1% to 3% of the atmosphere, it is one of the most active atmospheric components [1]. It directly affects the vertical stability of the atmosphere and the formation or evolution of weather systems and contributes to radiation balance and a series of weather phenomena, such as cloud formation, rainfall, or snowfall events, by absorbing or releasing massive amounts of latent heat during phase transition [2]. Water vapor exhibits a complicated spatio-temporal distribution [3]. The accurate detection of the distribution and variation of atmospheric water vapor content can provide the data necessary to understand weather processes for weather forecasting and meteorology research [3]. Precipitable water vapor (PWV), the water vapor content of a vertically integrated column per unit area, is a direct indicator of atmospheric water vapor content and is expressed as the height of the corresponding equivalent liquid water column in centimeters [4].
The meteorological application of the global navigation satellite system (GNSS) in the remote sensing of atmospheric water vapor has been a research hotspot since the early 1990s, given the rapid development of GNSS and the extensive construction of continuously operating reference stations (CORS). The principle of the GNSS-based PWV retrieval technique was initially proposed by Bevis et al. [3] in 1992. This technique assumes that the wet component of atmospheric delays is proportional to PWV. It has attracted considerable attention since it was first reported because it can provide abundant data with high spatio-temporal resolution and is suitable for all weather conditions [5]. Many scholars have compared this method with traditional methods to verify its reliability. Differences among PWV data for North America [6], Europe [7,8,9], and Asia [10,11,12] retrieved from GNSS, radiosonde, and water vapor radiometer (WVR) observations were less than 5 mm. This result indicates that the accuracy of GNSS-derived PWV is comparable with that of traditional techniques.
In recent years, many studies [13] have concentrated on estimating the weighted mean temperature (Tm), which is a key parameter in the conversion of zenith wet delay to PWV in GNSS meteorology. In theory, reanalysis products can provide sufficiently accurate Tm data. However, these products are released with a time delay and therefore cannot meet the real-time demand for Tm data [13]. Tm models have thus become an indispensable part of GNSS meteorology because they can be used to calculate Tm values in real time. Current Tm models fall into two main categories according to their modeling principles. The first category, global models, is usually built on the spatiotemporal variations of Tm. These models take input parameters such as the geographic coordinates of the point to be computed and the UTC time, and they can calculate accurate Tm values [14,15]. Because no meteorological data are needed and only the aforementioned parameters are required as input, global models are very useful for stations without surface meteorological sensors. The second category, regional models, is built on the relationship between Tm and surface meteorological elements and typically takes a linear form, such as the Bevis model [3] and the Liu model [16]. Previous studies have pointed out that the relationship between Tm and surface meteorological elements is not constant but varies with location and time [17,18]. This indicates that an empirical regional model based on local meteorological data will be more accurate for regional applications. Therefore, in this work, we developed a multifactorial regional Tm model using radiosonde and surface meteorological data from Hong Kong to meet the needs of GNSS-derived PWV retrieval in Hong Kong.
The latest studies (e.g., [19,20]) have shown that GNSS-derived PWV has considerable potential for precipitation forecasting and meteorological disaster warning. Furthermore, the analysis of water vapor trends or the forecasting of short-term precipitation events requires the transformation of one-dimensional PWV data from observation networks into two-dimensional data through spatial interpolation [21]. The spatial interpolation algorithm reveals the dynamic distribution of water vapor within the CORS coverage area by processing, in real time, the GNSS-derived PWV of each receiver in the CORS network. Li et al. [20] developed a real-time monitoring and analytical system for the dynamic spatial and temporal variation characteristics of water vapor. Their system can track the dynamic variation in water vapor content and forecast small- and medium-scale extreme weather. However, their description of the PWV interpolation method and its accuracy is limited. In fact, few studies have focused on the reliability of interpolation methods for PWV data, although quite a few studies have confirmed that the Kriging interpolation method performs well for meteorological variables such as temperature [22], rainfall [23], and wind [24]. Meanwhile, using high-density sample points with attributes such as PWV as the input of Kriging interpolation contributes to a more accurate continuous surface [25,26]. Realini et al. [27] found that even the densest GNSS networks had difficulty providing data with a spatial resolution high enough to detect local fluctuations in water vapor. For economic reasons, further increasing the density of GNSS networks is infeasible. Hence, this paper proposes an alternative method for increasing the sample density with an inexpensive digital elevation model (DEM), which incurs no equipment installation or maintenance costs. DEM points with three-dimensional coordinates are used to construct additional sample points (i.e., virtual sample points) through multifactorial fitting equations. The parameters of these equations are obtained from the strong correlation between GNSS-derived PWV and the geographic coordinates and elevations of the original sample observations. In this work, the GNSS-derived PWV from the Hong Kong region was interpolated through two schemes: Scheme I uses only the PWV data derived from the original stations in the Hong Kong GNSS network, and Scheme II combines the PWV data in Scheme I with those at the virtual sample points. The prediction performance of the two schemes is evaluated using cross-validation. Furthermore, the rationality of the GNSS-derived PWV maps obtained by the two schemes is assessed against the DEM maps and cumulative precipitation maps of the study area.
This paper is organized as follows: the study area, permanent GNSS network, and radiosonde station are described in Section 2. Methods for the retrieval of precipitable water vapor, acquisition of a regional Tm model, DEM point sampling, the establishment of virtual sample points, and spatial interpolation of PWV are discussed in Section 3. Section 4 presents test results of the proposed regional Tm model and GNSS-derived PWV spatial interpolation. A summary and discussion are given in Section 5.
3. Materials and Methods
The PWV retrieval processing flow is shown in Figure 2. The software obtains the zenith total delay (ZTD) after processing GNSS observations and meteorological data. An empirical model is used to calculate the zenith hydrostatic delay (ZHD) on the basis of the measured or interpolated meteorological parameters. The surface meteorological parameters are input into the regional model to estimate the weighted mean temperature (Tm), from which the conversion coefficient Π is obtained. The PWV is then calculated by multiplying the conversion coefficient Π by the zenith wet delay (ZWD). The details of the above steps are described in the following sections.
The proposed improved interpolation steps are shown in Figure 3. A series of points with three-dimensional coordinates is selected from the DEM point cloud through systematic sampling. These points are linked to the original sample points on the basis of the correlation between PWV values and geographical location, which yields virtual sample points with PWV values. The original and virtual sample points are then used together in the spatial interpolation that constructs the GNSS-derived PWV map. Finally, the quality of the interpolation (i.e., its rationality and prediction performance) is evaluated using the actual precipitation map and the prediction error obtained by cross-validation. The following sections describe these processes in detail.
3.1. Method for Water Vapor Retrieval
The ground-based GNSS meteorology system is based on the GNSS observation network. The remote sensing of tropospheric water vapor involves separating ZHD from ZTD and multiplying the resulting ZWD by the conversion coefficient Π to obtain the PWV value. These steps are expressed by Equations (1)–(3):

$$\mathrm{ZWD} = \mathrm{ZTD} - \mathrm{ZHD} \tag{1}$$

$$\mathrm{PWV} = \Pi \cdot \mathrm{ZWD} \tag{2}$$

$$\Pi = \frac{10^{6}}{\rho_w R_v \left( \dfrac{k_3}{T_m} + k_2' \right)} \tag{3}$$

where ρw is the density of liquid water, Rv is the specific gas constant for water vapor, k2′ and k3 are atmospheric refractivity constants, and Tm is the weighted mean temperature in K [3].
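As a minimal sketch of Equations (1)–(3), the Python snippet below converts ZTD and ZHD into PWV. The numerical constants are values commonly cited in the GNSS meteorology literature and are assumptions here, not values taken from this paper; the function names are hypothetical.

```python
# Assumed constants (commonly cited values, not taken from this paper).
RHO_W = 1000.0      # density of liquid water, kg/m^3
R_V = 461.5         # specific gas constant of water vapor, J/(kg K)
K2_PRIME = 22.1     # refractivity constant k2', K/hPa
K3 = 3.739e5        # refractivity constant k3, K^2/hPa


def conversion_factor(tm_kelvin):
    """Dimensionless conversion coefficient Pi for a weighted mean temperature in K."""
    # Convert the refractivity constants from per-hPa to per-Pa so that
    # rho_w * R_v * (k3 / Tm + k2') is dimensionless.
    k2_prime_pa = K2_PRIME / 100.0
    k3_pa = K3 / 100.0
    return 1.0e6 / (RHO_W * R_V * (k3_pa / tm_kelvin + k2_prime_pa))


def pwv_from_delays(ztd, zhd, tm_kelvin):
    """PWV in the same length unit as ZTD and ZHD."""
    zwd = ztd - zhd                              # wet delay = total - hydrostatic
    return conversion_factor(tm_kelvin) * zwd    # PWV = Pi * ZWD


if __name__ == "__main__":
    # Hypothetical delays in meters and a hypothetical Tm of 285 K.
    print(pwv_from_delays(ztd=2.55, zhd=2.30, tm_kelvin=285.0))
```

With these assumed constants, Π is roughly 0.16 for typical values of Tm, so a ZWD of a few hundred millimeters maps to a PWV of a few tens of millimeters.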
ZHD in Equation (1) is calculated using the Saastamoinen empirical model [28]:

$$\mathrm{ZHD} = \frac{0.0022768\,P}{1 - 0.00266\cos(2\phi) - 0.00028\,H} \tag{4}$$

where ZHD is in meters, P is the surface pressure (hPa), and ϕ (rad) and H (km) are the latitude and geodetic height of the station, respectively. P can be obtained from the RINEX meteorological data files released by the station. For stations that lack meteorological data, P can be calculated through interpolation with the Global Pressure and Temperature (GPT) model. Meanwhile, ϕ and H can be obtained from public receiver information.
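The Saastamoinen hydrostatic delay can be sketched as a single function. The coefficients below are those of the widely used form of the model (pressure in hPa, height in km, result in meters); the function name and the example inputs are hypothetical.

```python
import math


def saastamoinen_zhd(pressure_hpa, lat_rad, height_km):
    """Zenith hydrostatic delay in meters from surface pressure, latitude, and height."""
    # The denominator corrects for the variation of gravity with latitude and height.
    gravity_term = 1.0 - 0.00266 * math.cos(2.0 * lat_rad) - 0.00028 * height_km
    return 0.0022768 * pressure_hpa / gravity_term


if __name__ == "__main__":
    # Hypothetical station near 22.3 deg N, 50 m height, standard pressure.
    print(saastamoinen_zhd(1013.25, math.radians(22.3), 0.05))
```

At sea-level pressure the result is about 2.3 m, which is why ZHD dominates ZTD and must be removed accurately before the wet delay is converted to PWV.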
ZTD in Equation (1) was estimated using GAMIT software (version 10.6) [29], which is based on the double-difference model. The short distances between stations resulted in short baselines, which strengthened the correlation between the tropospheric parameters of different stations. To weaken this correlation, the observation data of four International GNSS Service (IGS) stations (BJFS, SHAO, URUM, and LHAZ) were incorporated into the baseline solution [30,31]. The remainder of the solution strategy for ZTD estimation is summarized in Table 2.
The setting Interval Zen/Number Zen = 1/25 means that ZTD parameters are estimated hourly. In this study, the downloaded VMF1 model (ftp://everest.mit.edu//pub/GRIDS/) is adopted as the mapping function.
3.2. Method for Obtaining Tm
As shown in Equations (2) and (3), Tm is a vital variable in the conversion of ZWD to PWV [3], and it can be calculated with the following equation:

$$T_m = \frac{\int (e/T)\,dh}{\int (e/T^{2})\,dh} \tag{5}$$

where e and T are the water vapor pressure and temperature of each layer of the atmosphere, respectively.
Because continuous profiles (e.g., of temperature) can hardly be acquired in practice, the integral in Equation (5) is often approximated as:

$$T_m \approx \frac{\sum_{i} (e_i/T_i)\,h_i}{\sum_{i} (e_i/T_i^{2})\,h_i} \tag{6}$$

where hi is the thickness of the ith atmospheric layer, and ei and Ti are the water vapor pressure and temperature of the ith atmospheric layer, respectively.
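The layer-wise approximation of Tm can be sketched as follows. The function name and the sample profile are hypothetical; each layer contributes its thickness weighted by e/T in the numerator and by e/T² in the denominator.

```python
def weighted_mean_temperature(vapor_pressure, temperature, thickness):
    """Discrete Tm (K) from per-layer vapor pressure, temperature (K), and thickness."""
    if not (len(vapor_pressure) == len(temperature) == len(thickness)):
        raise ValueError("all layer sequences must have the same length")
    numerator = sum(e / t * h for e, t, h in zip(vapor_pressure, temperature, thickness))
    denominator = sum(e / t ** 2 * h for e, t, h in zip(vapor_pressure, temperature, thickness))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical three-layer profile: vapor pressure in hPa, temperature in K, thickness in m.
    e = [25.0, 12.0, 4.0]
    t = [298.0, 288.0, 275.0]
    h = [500.0, 1000.0, 1500.0]
    print(weighted_mean_temperature(e, t, h))
```

Because the result is a weighted mean of the layer temperatures, it always lies between the coldest and warmest layers, and the moist lower layers dominate the weighting.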
A regional
Tm model for HK can be built on the basis of previous studies [
16]. The relevant method and steps for developing a high accuracy
Tm model are described in detail in
Section 4.
3.3. Method for DEM Point Sampling
A DEM is a continuous surface that represents ground elevation in the form of an ordered numerical array [32]. DEM data with a 3 arc-second resolution provided by the NASA Shuttle Radar Topography Mission (SRTM, version 3) were adopted in this study. This resolution corresponds to a horizontal grid spacing of approximately 90 m. The data used in this study are available on the website of the U.S. Geological Survey (https://dds.cr.usgs.gov/srtm/version2_1/SRTM3/). QGIS software (version 2.8) [33] was used to clip out a rectangular area containing HK, which spans 113.80°E–114.42°E and 22.15°N–22.56°N. The software was then used to transform the clipped data into the xyz grid format, a three-column matrix that holds the original geographic coordinates of the DEM data, where x, y, and z represent longitude, latitude, and ellipsoidal height, respectively. Next, these data were reorganized for display, and the SRTM DEM map of the study area is shown in Figure 4.
A massive number of DEM points reduces the efficiency of the interpolation algorithm. However, a set of points whose spatial distribution is determined by systematic sampling can be sufficient to capture the spatial variability of the research object. Thus, the approach only needs to extract a certain number of points at appropriate longitude and latitude intervals. In addition, only PWV data over the land surface need to be generated in view of the actual interpolation requirements. In this work, DEM samples were filtered according to the following steps:
- (1) The longitude interval was set to 0.045°, and a series of points was determined.
- (2) The latitude interval was set to 0.035°, and all points within the study area were determined.
- (3) From the points identified above, all points in the land area were retained, and some points over the ocean were removed.
After the above three steps, 98 points with three-dimensional coordinates were specified, as shown in Figure 5.
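The three sampling steps can be sketched as a regular grid followed by a land filter. This is only an illustration: anchoring the grid at the south-west corner of the study area is an assumption, and `elevation_at` and `is_land` are hypothetical callables standing in for a DEM lookup and a land/ocean mask.

```python
import math


def systematic_grid(lon_min, lon_max, lat_min, lat_max, dlon, dlat):
    """Regular (lon, lat) nodes, anchored at the south-west corner (an assumption)."""
    # A small tolerance guards against floating-point round-off at the upper bounds.
    n_lon = int(math.floor((lon_max - lon_min) / dlon + 1e-9)) + 1
    n_lat = int(math.floor((lat_max - lat_min) / dlat + 1e-9)) + 1
    return [(lon_min + i * dlon, lat_min + j * dlat)
            for j in range(n_lat) for i in range(n_lon)]


def sample_land_points(grid, elevation_at, is_land):
    """Keep only land nodes and attach the DEM elevation to each of them."""
    return [(lon, lat, elevation_at(lon, lat)) for lon, lat in grid if is_land(lon, lat)]


if __name__ == "__main__":
    grid = systematic_grid(113.80, 114.42, 22.15, 22.56, dlon=0.045, dlat=0.035)
    # Placeholder lookups; a real run would query the clipped SRTM data instead.
    points = sample_land_points(grid,
                                elevation_at=lambda lon, lat: 0.0,
                                is_land=lambda lon, lat: True)
    print(len(grid), len(points))
```

Keeping the mask as a separate callable makes the ocean-removal rule easy to swap, since the text does not specify exactly which ocean points were dropped.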
3.4. Method for the Establishment of Virtual Sample Points
Only 17 stations have PWV information (as shown in Figure 1) that can be interpolated to construct water vapor distribution maps for HK. Moreover, the spatial distribution of these stations is neither dense nor uniform enough to reflect the dynamic fluctuations of water vapor. Thus, the PWV information at the original sample points must be extended to uniformly distributed virtual sample points (Figure 5) through a reasonable method based on the relationship between the PWV values and the geographical positions of the 17 original points. The PWV values calculated with the GNSS observation data and meteorological data of the 17 stations for the period from 0:00 on August 19, 2017 to 0:00 on September 1, 2017 (the sampling interval was half an hour, providing 637 results in total) were compared with the station elevations (ellipsoidal heights in this paper) to explore the correlation between PWV and elevation. Figure 6 shows the number and percentage of samples in each range of correlation coefficients (CCs). The CCs of 70.96% of the samples exceeded 0.7, those of 17.42% were in the range of 0.5 to 0.7, and those of the remaining 11.62% were less than 0.5. These results indicate that PWV values and station elevation were significantly correlated in more than 88% of the samples. Therefore, extending PWV information to virtual sample points on the basis of the correlation between station elevation (or horizontal position) and PWV is reliable. The steps involved in extending PWV information are as follows:
- (1) The CCs between the PWV value of each station and its horizontal position terms (x, y, x², y², and xy) and elevation terms (h and h²) were determined, where x, y, and h represent longitude, latitude, and elevation, respectively.
- (2) Station position parameters (e.g., x, y, and h) with CCs of less than 0.7 were deleted.
- (3) A functional model for extending PWV was constructed on the basis of the linear or nonlinear relations between the PWV value and station position information, and its parameters were screened through the stepwise regression method:
where
and
are coefficient column vectors and constant row vectors of variables
, respectively. Eventually, the optimal multiple regression equation is deduced by the stepwise method at the significance level of 0.05.
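The screening-and-fitting idea can be sketched with NumPy as follows. This is a deliberately simplified stand-in: candidate terms are kept when the absolute value of their CC with PWV reaches 0.7, and the retained terms are then fitted by ordinary least squares. It does not implement the stepwise regression at the 0.05 significance level described above, and all function names are hypothetical.

```python
import numpy as np


def candidate_terms(x, y, h):
    """Candidate predictors built from longitude x, latitude y, and elevation h."""
    x, y, h = (np.asarray(v, dtype=float) for v in (x, y, h))
    return {"x": x, "y": y, "x2": x ** 2, "y2": y ** 2, "xy": x * y, "h": h, "h2": h ** 2}


def fit_pwv_model(x, y, h, pwv, cc_threshold=0.7):
    """Screen terms by |CC| with PWV, then fit the kept terms by least squares."""
    pwv = np.asarray(pwv, dtype=float)
    terms = candidate_terms(x, y, h)
    kept = [name for name, values in terms.items()
            if abs(np.corrcoef(values, pwv)[0, 1]) >= cc_threshold]
    # The design matrix holds an intercept column followed by the retained terms.
    design = np.column_stack([np.ones_like(pwv)] + [terms[name] for name in kept])
    coef, *_ = np.linalg.lstsq(design, pwv, rcond=None)
    return kept, coef


def predict_pwv(x, y, h, kept, coef):
    """Evaluate the fitted model at virtual sample points."""
    terms = candidate_terms(x, y, h)
    ones = np.ones_like(terms["h"])
    design = np.column_stack([ones] + [terms[name] for name in kept])
    return design @ coef
```

In use, `fit_pwv_model` would be called once per epoch with the coordinates and PWV values of the original stations, and `predict_pwv` would then be evaluated at the DEM sample points to obtain the virtual PWV values.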
3.5. Method for Spatial Interpolation
Spatial interpolation is a common method used to obtain information at an unmeasured position on the basis of known information from surrounding stations [34], which are known as sample points. An interpolated value is also called a predicted value. Techniques such as inverse distance weighted interpolation, Kriging, natural neighbor, and two-dimensional minimum curvature spline are often used in spatial interpolation. The Kriging technique is a geostatistical (rather than deterministic) approach that generates a continuous surface that does not pass through all sample points. The prediction provided by the Kriging technique is an unbiased estimate of the true value with minimum variance. It has been used for years in a wide range of fields, including ecology, hydrology, meteorology, and geomatics [35,36,37,38]. On the basis of existing research results [34,35,36,37,38] and the high correlation between PWV and elevation, the co-Kriging (CK) method is adopted in this study for the spatial interpolation of PWV. In CK, the prediction at a point is defined by the following linear weighted model:

$$Z(s_0) = \sum_{i=1}^{n} \lambda_i\, z(s_i) + \sum_{j=1}^{m} b_j\, x(s_j) \tag{9}$$

where Z(s0) is the predicted value at the location s0; z(si) and x(sj) represent the values of the main variable and subvariable at locations i and j, respectively; n and m are the sample sizes of z and x, respectively; and λi and bj are the CK weights, which depend on the spatial relationship between the values at the estimation point and the sample points. λi and bj are obtained using the Lagrange multiplier method as follows:
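$$
\begin{cases}
\sum_{i=1}^{n} \lambda_i\, \gamma_{zz}(s_i, s_k) + \sum_{j=1}^{m} b_j\, \gamma_{zx}(s_k, s_j) + \mu_1 = \gamma_{zz}(s_0, s_k), & k = 1, \dots, n \\
\sum_{i=1}^{n} \lambda_i\, \gamma_{zx}(s_i, s_l) + \sum_{j=1}^{m} b_j\, \gamma_{xx}(s_j, s_l) + \mu_2 = \gamma_{zx}(s_0, s_l), & l = 1, \dots, m \\
\sum_{i=1}^{n} \lambda_i = 1, \quad \sum_{j=1}^{m} b_j = 0
\end{cases} \tag{10}
$$

where γzz(si, sk) is the value of the variogram between z(si) and z(sk). Similarly, γzx(si, sl) is the value of the cross-variogram between z(si) and x(sl), and so on; μ1 and μ2 are the Lagrange multipliers. The dissimilarity between two sample points can be measured using a variogram, which is a function of the distance and direction between the two points. The values of λi and bj are obtained by solving the linear system in Equation (10) and are then substituted into Equation (9) to compute the CK prediction at each point.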
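To illustrate how Kriging weights follow from a variogram and a Lagrange multiplier, the sketch below solves the single-variable ordinary Kriging system with NumPy. It is not the co-Kriging system used in this study: the secondary variable is omitted, and the linear variogram and function names are assumptions chosen for brevity.

```python
import numpy as np


def linear_variogram(distance, slope=1.0):
    """Assumed variogram model: dissimilarity grows linearly with distance."""
    return slope * distance


def ordinary_kriging(points, values, target, variogram=linear_variogram):
    """Return (prediction, kriging variance, weights) at one target location."""
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Left-hand side: sample-to-sample variogram values bordered by the
    # unbiasedness constraint (weights sum to one).
    pair_dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = variogram(pair_dist)
    lhs[n, n] = 0.0

    # Right-hand side: sample-to-target variogram values, then the constraint.
    rhs = np.ones(n + 1)
    rhs[:n] = variogram(np.linalg.norm(points - target, axis=1))

    solution = np.linalg.solve(lhs, rhs)
    weights, multiplier = solution[:n], solution[n]
    prediction = float(weights @ values)
    variance = float(weights @ rhs[:n] + multiplier)
    return prediction, variance, weights
```

The co-Kriging system has the same bordered structure, with additional blocks for the secondary variable and its cross-variograms and a second Lagrange multiplier for the second constraint.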
The prediction accuracy can be validated using the differences between the predicted and measured values at the sample points, because the latter can be regarded as the truth. In addition, the prediction performance of Kriging interpolation was evaluated by leave-one-out cross-validation. The idea is to temporarily remove one datum at a time from the data set and re-predict its value on the basis of the remaining data. Hence, the predicted value at each point used to assess the Kriging interpolation performance comes from this re-prediction. In this study, the following statistical quantities are used to evaluate the quality of a set of interpolation results:
- (1) Mean error (ME) is the averaged difference between the predicted and measured values. Values close to 0 are preferred. The equation for ME is as follows:

$$\mathrm{ME} = \frac{1}{n} \sum_{i=1}^{n} \left[ \hat{Z}(S_i) - z(S_i) \right] \tag{11}$$

where Ẑ(Si) and z(Si) are the predicted and measured values at location Si, respectively, and n is the number of sample points.
- (2) Root mean square error (RMSE) also measures the deviation between the predicted and measured values. Smaller RMSE values indicate better accuracy. This index is calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left[ \hat{Z}(S_i) - z(S_i) \right]^{2}} \tag{12}$$

- (3) Root mean square standardized error (RMSSE) can also be used to evaluate the quality of a set of predictions. Its value should be close to 1 if the prediction standard errors are valid, so values close to 1 indicate reliable predictions. Equation (13) gives this index:

$$\mathrm{RMSSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \frac{\left[ \hat{Z}(S_i) - z(S_i) \right]^{2}}{\hat{\sigma}^{2}(S_i)}} \tag{13}$$

where σ̂²(Si) is the variance of the prediction at location Si.
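The three indicators and the leave-one-out loop can be sketched in plain Python. The function names are hypothetical, and `predict` stands for any interpolator that takes the remaining points and values plus a target location and returns a single predicted value.

```python
import math


def mean_error(predicted, measured):
    """ME: average of predicted minus measured values."""
    return sum(p - m for p, m in zip(predicted, measured)) / len(measured)


def rmse(predicted, measured):
    """RMSE: root of the mean squared prediction error."""
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured))


def rmsse(predicted, measured, predicted_std):
    """RMSSE: like RMSE, but each error is divided by its prediction standard error."""
    total = sum(((p - m) / s) ** 2 for p, m, s in zip(predicted, measured, predicted_std))
    return math.sqrt(total / len(measured))


def leave_one_out(points, values, predict):
    """Re-predict every sample from all the other samples."""
    results = []
    for k in range(len(values)):
        train_points = points[:k] + points[k + 1:]
        train_values = values[:k] + values[k + 1:]
        results.append(predict(train_points, train_values, points[k]))
    return results


if __name__ == "__main__":
    # Hypothetical example with a trivial "mean of the others" interpolator.
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    vals = [1.0, 2.0, 3.0]
    loo = leave_one_out(pts, vals, lambda p, v, target: sum(v) / len(v))
    print(mean_error(loo, vals), rmse(loo, vals))
```

An RMSSE above 1 suggests that the prediction standard errors underestimate the actual variability, and a value below 1 suggests that they overestimate it.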
5. Summary and Discussion
In this research, we have proposed a new interpolation method (the LZ method) for refining the GNSS-derived PWV distribution map. In addition, a multifactorial regional Tm model (the MR model) was proposed to support the test experiment of the LZ method. The relative RMSE results show that, compared with previous Tm models (i.e., the Bevis model and the Liu model), the MR model introduced smaller differences into the resultant GNSS-derived PWV. The core of the LZ method is to densify the sample points by providing virtual sample points. On the basis of the statistically significant correlation between PWV and geographic coordinates/elevation at the 17 original sample points, PWVs were extended from the original 17 stations to 98 uniformly distributed DEM virtual sample points. Four epochs during the period from August 22 to 23, 2017 were selected to check the performance of the LZ method. The results indicate that the PWV maps generated by the LZ method contain more realistic details than those generated by the conventional interpolation method with only the 17 original sample points. In many more areas of the PWV map, the PWV value tends to decrease with increasing elevation. Moreover, the precipitation maps show a positive correlation between precipitation and elevation in HK. In addition, all of the accuracy indicators (i.e., ME, RMSE, and RMSSE) show that the LZ method performs better than the conventional method.
Overall, the LZ method, which is based on virtual sample points, resolved the insufficient horizontal resolution of PWV interpolation results caused by the sparse and uneven distribution of GNSS stations. Future work will analyze the accuracy of the proposed approaches under different weather conditions and in different locations.