Global Ensemble Learning-Based Refined Models for VMF1-FC Forecasted Weighted Mean Temperature

Cao, Liying; Sang, Jizhang; Li, Feijuan; Zhang, Bao

doi:10.3390/rs18091315


School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China


Remote Sens. 2026, 18(9), 1315; https://doi.org/10.3390/rs18091315

Submission received: 31 March 2026 / Revised: 21 April 2026 / Accepted: 22 April 2026 / Published: 25 April 2026

(This article belongs to the Special Issue Advances in Multi-GNSS Technology and Applications (2nd Edition))


Highlights

What are the main findings?

The VMF1-FC demonstrates competitive RMSE performance but a larger bias in forecasted Tm relative to the GPT3 model, based on assessment at 319 global radiosonde sites.
Ensemble learning-based refined models (XTm, LTm, and CTm) effectively reduce the bias of VMF1-FC Tm and lead to further improvement in RMSE (≈18%) relative to VMF1-FC.

What are the implications of the main findings?

The refined models provide more accurate and spatially stable global-scale Tm across different latitudes, height ranges, and temporal scales.
The refined VMF1-FC Tm models have strong potential to enhance the reliability of near-real-time GNSS-based precipitable water vapor (PWV) sensing and weather forecasting applications.

Abstract

Accurately forecasting the weighted mean temperature (Tm) is critical for converting the zenith wet delay (ZWD) into global navigation satellite system (GNSS)-based precipitable water vapor (PWV) for real-time sensing and forecasting applications. The forecast Vienna Mapping Function 1 (VMF1-FC) is a global forecast product developed by TU Wien based on numerical weather prediction models and can provide grid-wise Tm one day ahead. In this study, we evaluate the accuracy of VMF1-FC-forecasted Tm using observations from 319 global radiosonde (RS) sites during 2019–2021. The results indicate that, compared with the widely used Global Pressure and Temperature 3 (GPT3) model, VMF1-FC-forecasted Tm shows a relatively low root mean square error (RMSE) but a relatively large bias (0.75 K). To improve the accuracy of VMF1-FC-forecasted Tm, three refined models, XTm, LTm, and CTm, are developed using Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost), respectively, based on observations from the 319 RS sites. The models use longitude, latitude, ellipsoidal height, floating day of year (fdoy), and VMF1-FC Tm as input features, and RS Tm as the target variable. Validation using RS data from 2022 that are not involved in model development shows that the refined models significantly reduce bias, with biases of 0 K, 0 K, and −0.03 K for XTm, LTm, and CTm, respectively. Benefiting from the effective reduction in bias, the RMSE is correspondingly reduced. The RMSEs of XTm, LTm, and CTm are 1.45 K, 1.45 K, and 1.46 K, respectively, corresponding to improvements of 18.50%/64.93%, 18.44%/64.91%, and 18.11%/64.76% relative to the VMF1-FC/GPT3 models. In addition, the three refined models demonstrate higher accuracy and improved stability across different latitude bands, ellipsoidal height ranges, and temporal scales. The refined models provide more accurate global-scale Tm and offer strong potential for GNSS meteorological applications, particularly real-time GNSS-based PWV sensing and weather forecasting.

Keywords:

weighted mean temperature (Tm); refined models; ensemble learning algorithms; forecast Vienna Mapping Function 1 (VMF1-FC)

1. Introduction

Atmospheric water vapor is one of the most important greenhouse gases and is essential for various meteorological studies and applications, including weather forecasting, climate analysis, and disaster early warning [1,2,3,4,5,6]. At present, various techniques are widely used for atmospheric water vapor observation [7]. These include microwave radiometers, radiosondes, and satellite remote sensing. In addition, the global navigation satellite system (GNSS) has been increasingly applied to precipitable water vapor (PWV) sensing because of its advantages of having a low cost, high accuracy, high temporal resolution, and all-weather capability [8,9,10]. With the increasing frequency and intensity of extreme weather events, there is a growing demand in GNSS meteorology for timely and reliable atmospheric water vapor information to support rapid monitoring and forecasting. These characteristics make GNSS-based PWV especially valuable for real-time sensing and forecasting applications. The atmospheric weighted mean temperature (Tm) is the only independent parameter required for converting the zenith wet delay (ZWD) from GNSS signals into PWV [11,12,13]. Therefore, improving the accuracy of forecasted Tm is critical for enhancing the conversion from GNSS ZWD to PWV and thus improving the performance of real-time GNSS-based PWV sensing.
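The role of Tm in the ZWD-to-PWV conversion can be illustrated with a minimal sketch. It assumes the commonly used form PWV = Π · ZWD, in which the dimensionless factor Π depends only on Tm. The physical constants below are commonly cited literature values and are not taken from this article; the function names are illustrative only.

```python
# Sketch of the ZWD -> PWV conversion, assuming PWV = PI * ZWD with
# PI = 1e6 / (rho_w * R_v * (k3 / Tm + k2')).
# The constants are commonly cited literature values (assumptions here),
# not values reported in this article.
RHO_W = 1000.0     # density of liquid water, kg/m^3
R_V = 461.5        # specific gas constant of water vapor, J/(kg K)
K2_PRIME = 0.221   # refractivity constant, K/Pa (22.1 K/hPa)
K3 = 3739.0        # refractivity constant, K^2/Pa (3.739e5 K^2/hPa)


def conversion_factor(tm_kelvin):
    """Dimensionless factor PI that maps ZWD to PWV for a given Tm (K)."""
    return 1.0e6 / (RHO_W * R_V * (K3 / tm_kelvin + K2_PRIME))


def zwd_to_pwv(zwd, tm_kelvin):
    """Convert ZWD to PWV; the result has the same length unit as zwd."""
    return conversion_factor(tm_kelvin) * zwd
```

Under these assumptions the factor is close to 0.16 for Tm = 280 K and increases with Tm, so an error in the forecasted Tm propagates directly into the PWV estimate.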

At present, forecasted Tm can be obtained using two main approaches: empirical models [8,14] and numerical weather prediction models [15]. For empirical models, one widely used category is based on a linear relationship between surface temperature (Ts) and Tm. This relationship was first proposed by Bevis et al. [8] and can be expressed as Tm = a + bTs, where a and b are empirical coefficients determined through linear regression. Based on observations from 13 radiosonde (RS) sites in the United States, the coefficients were derived as a = 70.20 and b = 0.72. The resulting model achieves a global Tm accuracy of 4.74 K. However, this linear relationship between Ts and Tm varies with time and location [16]. Subsequently, numerous global or regional models accounting for seasonal and regional variability have been developed [17,18,19,20]. Wang et al. [21] utilized RS data from the Arctic and Antarctic regions during 2008–2015 to establish quadratic and linear regression Tm models for the Antarctic and Arctic regions, respectively. Both models achieve RMSE values of approximately 2.87–3.53 K, outperforming the GPT2w model in polar regions. Based on the relationship between Tm and Ts, Yang et al. [22] proposed a global Tm model (GGTm-Ts), whose validation results indicate improved global accuracy compared with the Bevis formula. However, these models rely heavily on in situ surface meteorological data and therefore have a limited range of application. Another category of empirical models aims to reduce dependence on in situ surface meteorological data. For example, Yao et al. [23] developed a global weighted mean temperature model (GWMT) using data from 135 global RS sites during 2005–2009, in which Tm can be estimated using only location and temporal information. Because the RS sites used for model construction were all located over land, the model produces biased Tm estimates over oceanic regions. Subsequently, Yao et al. [24] and Yao et al. [25] combined atmospheric reanalysis data to further improve the GWMT model, leading to the development of the GTm-II and GTm-III models with improved performance. In addition, Böhm et al. [26,27,28,29] considered the annual and semiannual variations in Tm and developed the global pressure and temperature (GPT) model series, including GPT2w and GPT3, based on European Centre for Medium-Range Weather Forecasts reanalysis interim (ERA-Interim) data. Several other Tm models have been established using similar methods [13,30,31,32]. Among these models, GPT3 is currently regarded as the most representative empirical model and demonstrates relatively high accuracy on a global scale [12,33,34].
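As a concrete instance of the first category of empirical models, the linear Ts–Tm relation of Bevis et al. with the coefficients quoted above (a = 70.20, b = 0.72) can be written as a short function. The function name is illustrative; temperatures are in kelvin.

```python
def bevis_tm(ts_kelvin, a=70.20, b=0.72):
    """Linear Ts-Tm relation, Tm = a + b * Ts, with the coefficients quoted in the text."""
    return a + b * ts_kelvin


# Example: a surface temperature of 300 K maps to a Tm of about 286.2 K.
tm_example = bevis_tm(300.0)
```

Because a and b are fixed, the relation cannot follow the temporal and regional variability noted in [16], which motivates the regional and seasonal variants cited above.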

The continuous updating of numerical weather prediction (NWP) data and the improvement in data quality provide a promising approach for Tm prediction [35,36]. Several studies have shown that NWP-based approaches outperform empirical models in tropospheric delay prediction because NWP systems assimilate up-to-date atmospheric observations, enabling a better representation of the current tropospheric state and its rapid variations [37,38,39]. Based on 24 h NWP products from the European Centre for Medium-Range Weather Forecasts (ECMWF), the VMF data service provides global gridded forecast products, including VMF1-FC and VMF3-FC. Using the grid-wise VMF1-FC, Tm can be obtained globally at arbitrary locations and times through interpolation, which gives it considerable potential for applications in real-time GNSS meteorology [40,41]. However, similar to reanalysis products, NWP data are generated through the integration and simulation of global observations. Consequently, their accuracy strongly depends on the spatial distribution and quality of the observations. As a result, higher accuracy is generally achieved in areas with dense and high-quality observations, whereas uncertainty remains in regions with sparse observational data. Consequently, spatially inconsistent accuracy may arise in forecasted tropospheric parameters, with elevated uncertainty in some local regions.

To address these issues, the accuracy of grid-wise VMF1-FC forecasted Tm (VMF1-FC Tm) is assessed using observations from 319 global RS sites as a reference, with the widely used GPT3 model serving as a comparison model. Furthermore, three refined models for VMF1-FC Tm are developed based on ensemble learning algorithms, namely Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost). The refined models are denoted as XTm, LTm, and CTm, respectively, and their Tm forecasting performance is evaluated across different temporal and spatial scales. The remainder of this paper is organized as follows. Section 2 introduces the datasets, models, Tm derivation, and accuracy metrics used in this study. Section 3 presents the accuracy assessment of VMF1-FC Tm using RS Tm. Section 4 describes the construction of the refined models. Section 5 provides a performance evaluation of the developed refined models. Section 6 summarizes the conclusions.

2. Data and Methodology

2.1. RS Tm

RS sites are generally established for upper-air meteorological observation and collect upper-atmospheric data by launching radiosonde balloons twice daily (00:00 and 12:00 UTC). The Integrated Global Radiosonde Archive (IGRA) provides layered meteorological parameters, including pressure, temperature, and relative humidity, from the surface to approximately 30 km altitude for global RS sites. These data are freely available from the National Centers for Environmental Information (NCEI). Because RS observations are in situ measurements, Tm derived from RS meteorological parameters is relatively accurate and is commonly used as the reference for observation-based analysis and model validation [1,42,43,44,45,46]. In this study, Tm is obtained at 319 globally distributed RS sites, whose spatial distribution is shown in Figure 1. Tm is calculated as follows:

T_{m} = \frac{\sum_{i=1}^{n} \frac{e_{i}}{T_{i}} \Delta h_{i}}{\sum_{i=1}^{n} \frac{e_{i}}{T_{i}^{2}} \Delta h_{i}}

(1)

where e_{i} and T_{i} denote the water vapor pressure (hPa) and air temperature (K) at the i-th radiosonde level, respectively, \Delta h_{i} represents the geopotential height difference (m) of the corresponding atmospheric layer, and n is the total number of available vertical levels.
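Equation (1) can be evaluated directly from a discretized profile. The sketch below assumes that the water vapor pressure, temperature, and layer thickness have already been prepared as three equal-length sequences, one entry per layer; how level values are assigned to layers is not specified here, and the function name is illustrative.

```python
def weighted_mean_temperature(e, t, dh):
    """Evaluate Equation (1).

    e  : water vapor pressure per layer (hPa)
    t  : air temperature per layer (K)
    dh : layer thickness (m)
    """
    if not (len(e) == len(t) == len(dh)) or len(e) == 0:
        raise ValueError("e, t and dh must be non-empty and of equal length")
    numerator = sum(ei / ti * dhi for ei, ti, dhi in zip(e, t, dh))
    denominator = sum(ei / ti ** 2 * dhi for ei, ti, dhi in zip(e, t, dh))
    return numerator / denominator
```

Two properties follow from the formula and are useful sanity checks: an isothermal profile returns its own temperature, and Tm always lies between the lowest and highest layer temperatures, weighted toward the moist lower layers.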

2.2. VMF1-FC Tm

The VMF data center provides three types of products: post-processed products (VMF-EI), ultra-rapid products (VMF-OP), and forecast products (VMF-FC). The VMF-FC product is generated from the 24 h numerical weather prediction model output of the European Centre for Medium-Range Weather Forecasts (ECMWF). The VMF-FC products include VMF1-FC (2.5° × 2°) and VMF3-FC (1° × 1° and 5° × 5°), and both provide grid-wise and site-wise forecasts at four epochs (00:00, 06:00, 12:00, and 18:00 UTC) for the following day [40,47]. Only VMF1-FC provides Tm. Thus, in this study, the grid-wise VMF1-FC product is used to obtain forecasted Tm at the 319 RS sites. The processing procedure is as follows. For each forecast epoch, the full gridded field of Tm is first extracted. Bilinear interpolation is then performed in the horizontal direction to derive Tm at the target RS site. Subsequently, the vertical lapse-rate model recommended by Kouba [48] is used to extrapolate Tm from the grid-point height to the RS site height, yielding the VMF1-FC Tm at that RS site and epoch. This procedure is repeated for all available epochs, and the resulting VMF1-FC Tm values are temporally matched with the corresponding RS-derived Tm for subsequent analysis. It is worth noting that the VMF1-FC product uses ellipsoidal heights, whereas RS profiles are provided in geopotential heights. To avoid inconsistencies arising from different vertical reference systems, the geopotential heights of the RS data are converted to ellipsoidal heights following the method described in [47].
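The horizontal step of this procedure can be sketched as a standard bilinear interpolation between the four grid nodes surrounding a site. The function and argument names are illustrative, and the subsequent height correction with the lapse-rate model of [48] is deliberately not reproduced here.

```python
def bilinear(lon, lat, lon0, lon1, lat0, lat1, f00, f10, f01, f11):
    """Bilinear interpolation inside one grid cell.

    f00 = value at (lon0, lat0), f10 = value at (lon1, lat0),
    f01 = value at (lon0, lat1), f11 = value at (lon1, lat1).
    """
    tx = (lon - lon0) / (lon1 - lon0)  # fractional position in longitude
    ty = (lat - lat0) / (lat1 - lat0)  # fractional position in latitude
    return ((1 - tx) * (1 - ty) * f00 + tx * (1 - ty) * f10
            + (1 - tx) * ty * f01 + tx * ty * f11)


# A site at the centre of a 2.5 deg x 2 deg cell receives the mean of the four nodes.
tm_site = bilinear(1.25, 1.0, 0.0, 2.5, 0.0, 2.0, 270.0, 272.0, 274.0, 276.0)
```

At a grid node the function returns the node value itself, so sites that coincide with grid points are left unchanged by the horizontal step.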

2.3. GPT3 Tm

The GPT3 model provides grid-based empirical estimates of surface pressure, temperature, and other meteorological parameters. Owing to its global coverage, low computational cost, and independence from real-time meteorological observations, GPT3 has been extensively adopted in GNSS applications, particularly for tropospheric delay correction and precipitable water vapor sensing. Given the modified Julian date, latitude, longitude, and ellipsoidal height, meteorological parameters such as pressure and temperature can be obtained. Tm at the grid points is calculated as follows:

T_{m} = A_{0} + A_{1} \cos\left(\frac{doy}{365.25} 2\pi\right) + B_{1} \sin\left(\frac{doy}{365.25} 2\pi\right) + A_{2} \cos\left(\frac{doy}{365.25} 4\pi\right) + B_{2} \sin\left(\frac{doy}{365.25} 4\pi\right)

(2)

where doy represents the day of year, A_{0} is the mean value, A_{1} and B_{1} are the annual harmonic amplitudes, and A_{2} and B_{2} are the semiannual harmonic amplitudes. The GPT3 model provides two versions with horizontal resolutions of 1° × 1° and 5° × 5°. In this study, the 1° × 1° version is adopted.
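Equation (2) translates into a few lines of code. The coefficient values in the example are made up for illustration only; in GPT3 they are read from the model grid for each grid point.

```python
import math


def harmonic_tm(doy, a0, a1, b1, a2, b2):
    """Evaluate Equation (2): mean value plus annual and semiannual harmonics."""
    phase = doy / 365.25 * 2.0 * math.pi
    return (a0
            + a1 * math.cos(phase) + b1 * math.sin(phase)
            + a2 * math.cos(2.0 * phase) + b2 * math.sin(2.0 * phase))


# Illustrative (not real GPT3) coefficients for a single grid point.
tm_mid_year = harmonic_tm(182.0, 275.0, -8.0, -2.0, 0.5, 0.3)
```

Because the expression depends only on the day of year, the model reproduces a climatological seasonal cycle and carries no information about the actual weather on a given day, which is consistent with the larger RMSE of GPT3 reported later.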

2.4. Statistical Metrics

The bias and root mean square error (RMSE) are adopted as statistical metrics to evaluate the performance of the models. The calculation formulas are given as follows:

Bias = \frac{1}{N} \sum_{i=1}^{N} \left(T_{m,i}^{est} - T_{m,i}^{ref}\right)

(3)

RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(T_{m,i}^{est} - T_{m,i}^{ref}\right)^{2}}

(4)

where T_{m}^{est} denotes the Tm derived from VMF1-FC, GPT3, and the refined models, T_{m}^{ref} represents the reference Tm obtained from RS, and N is the total number of samples.
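Both metrics can be computed from paired sequences of estimated and reference Tm. The sketch below is a direct transcription of the definitions; the function names are illustrative.

```python
import math


def bias(est, ref):
    """Mean of (estimate - reference), in the unit of the inputs (K)."""
    diffs = [e - r for e, r in zip(est, ref)]
    return sum(diffs) / len(diffs)


def rmse(est, ref):
    """Square root of the mean squared (estimate - reference) difference."""
    diffs = [e - r for e, r in zip(est, ref)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

The two metrics are complementary: differences of +1 K, +1 K, and −2 K give a bias of zero but an RMSE of about 1.41 K, whereas a constant offset of 0.75 K gives a bias and an RMSE of 0.75 K each. A model can therefore have a low RMSE and still carry a systematic offset.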

3. Accuracy Assessment of VMF1-FC Tm

The accuracy of VMF1-FC Tm is first evaluated using RS Tm derived from 319 global RS sites during 2019–2021 as the reference. In addition, the Tm derived from the GPT3 model is introduced for comparative evaluation. Table 1 shows the bias and RMSE of VMF1-FC and GPT3 Tm. The results indicate that the RMSE of VMF1-FC Tm is 1.76 K, which is considerably lower than that of GPT3 (4.24 K). However, a pronounced positive bias of 0.75 K is observed for VMF1-FC, whereas GPT3 exhibits a negative bias of −0.60 K. These results suggest that VMF1-FC tends to overestimate Tm and, despite its lower RMSE, does not show a clear advantage over GPT3 in terms of bias.

To evaluate the spatial variability of VMF1-FC Tm, Figure 2 further illustrates the spatial distributions of bias and RMSE for GPT3 and VMF1-FC Tm at the global RS sites. As shown in Figure 2a,b, GPT3 exhibits predominantly negative biases over most regions, particularly in the mid- and high-latitude areas, whereas VMF1-FC shows mainly positive biases across the globe, with more pronounced overestimation in the Northern Hemisphere. This contrasting bias pattern is consistent with the overall bias statistics reported in Table 1. In terms of RMSE (Figure 2c,d), GPT3 generally presents larger errors, especially at mid- to high-latitude stations, where RMSE frequently exceeds 4–6 K. In contrast, VMF1-FC exhibits consistently lower RMSE values across most regions, including both continental and coastal areas.

To further assess the temporal accuracy and reliability of VMF1-FC Tm, the monthly mean bias and RMSE of GPT3 and VMF1-FC Tm are calculated, and the results are shown in Figure 3. The results indicate that the biases of GPT3 Tm are persistently negative throughout the year, with relatively larger negative values during summer months. In contrast, VMF1-FC shows consistently positive biases in all months, with limited seasonal variation. However, larger biases are observed for VMF1-FC in January, February, May, November, and December compared with GPT3. The RMSE of GPT3 exhibits pronounced seasonal variability, with larger values in winter and smaller values in summer. In contrast, the RMSE of VMF1-FC shows relatively weak seasonal variation and remains below 2 K throughout the year, showing better overall performance than GPT3.

The above results suggest that VMF1-FC has strong potential for forecasting Tm, as supported by its superior RMSE performance relative to GPT3. However, the VMF1-FC-forecasted Tm exhibits a relatively large and unstable bias. Therefore, further refinement of VMF1-FC Tm is necessary to improve its predictive accuracy and practical applicability.

4. Development of the Ensemble Learning-Based Refined Models

Ensemble learning algorithms are employed to construct refined VMF1-FC Tm models, aiming to achieve higher accuracy in Tm estimation. During model development, feature selection and hyperparameter optimization are carefully considered to obtain refined models with optimal performance and practical applicability.

4.1. Ensemble Learning Algorithms

Ensemble learning combines multiple weak learners to construct a strong model. Common ensemble frameworks include bagging, boosting, and stacking. Among them, boosting-based ensemble learning constructs a strong learner through an iterative training strategy, in which greater emphasis is progressively placed on samples or residuals that are poorly predicted in previous iterations, as illustrated in Figure 4. This process can be interpreted as an adaptive adjustment of sample weights or contributions, guiding later learners to focus on correcting prior errors. Compared with other conventional machine learning algorithms, boosting-based ensemble methods are effective in capturing complex nonlinear relationships and improving generalization performance. By adaptively emphasizing difficult samples during training, these methods exhibit enhanced robustness to outliers and heterogeneous data distributions, making them particularly suitable for refining bias in VMF1-FC Tm. Accordingly, three representative boosting-based ensemble algorithms—XGBoost, LightGBM, and CatBoost—are employed in this study to construct the VMF1-FC Tm refined models.

XGBoost is a widely used implementation of gradient boosting decision trees that enhances model performance through second-order gradient optimization and regularization techniques. These features improve convergence stability and effectively control overfitting.

LightGBM is an efficient gradient boosting framework that accelerates training by employing histogram-based feature discretization and a leaf-wise tree growth strategy. This design significantly reduces computational cost while maintaining high predictive accuracy, which is advantageous for large-scale geophysical datasets.

CatBoost is a gradient boosting algorithm designed to improve model robustness by mitigating prediction bias through ordered boosting and advanced handling of heterogeneous feature distributions.

Detailed descriptions of the XGBoost, LightGBM, and CatBoost algorithms can be found in [49,50,51].
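The residual-fitting idea behind boosting can be illustrated with a deliberately small sketch: each round fits a one-split regression tree (a stump) to the residuals left by the previous rounds and adds a damped copy of it to the prediction. This is plain gradient boosting with squared loss on a single feature, written for illustration only; it is not the XGBoost, LightGBM, or CatBoost implementation, and all names are hypothetical.

```python
def fit_stump(x, r):
    """Find the threshold on x that best splits residuals r (least squares)."""
    best = None
    for thr in sorted(set(x))[:-1]:
        left = [ri for xi, ri in zip(x, r) if xi <= thr]
        right = [ri for xi, ri in zip(x, r) if xi > thr]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = (sum((ri - lm) ** 2 for ri in left)
               + sum((ri - rm) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def fit_boosted(x, y, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit stumps to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        thr, lm, rm = fit_stump(x, resid)
        stumps.append((thr, lm, rm))
        pred = [pi + learning_rate * (lm if xi <= thr else rm)
                for xi, pi in zip(x, pred)]
    return {"base": base, "lr": learning_rate, "stumps": stumps}


def predict(model, xi):
    """Sum the base value and the damped contributions of all stumps."""
    out = model["base"]
    for thr, lm, rm in model["stumps"]:
        out += model["lr"] * (lm if xi <= thr else rm)
    return out
```

The three libraries used in this study build on the same additive scheme and differ mainly in how trees are grown, regularized, and ordered, as summarized in the paragraphs above.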

4.2. Refined Models Development and Training Strategy

The Tm exhibits clear seasonal variability and geographic dependence [12,13]. Considering the ease of practical model implementation, longitude (lon), latitude (lat), ellipsoidal height (h), floating day of year (fdoy), and VMF1-FC Tm are used as input features, while RS Tm is used as the target variable to construct the refined VMF1-FC Tm models based on the XGBoost, LightGBM, and CatBoost algorithms, denoted as XTm, LTm, and CTm, respectively. These input features are selected to capture the key factors affecting Tm. In addition to the need to correct VMF1-FC Tm, lon and lat represent large-scale spatial variability, while h accounts for vertical dependence. The fdoy is introduced to describe seasonal and temporal variations. This combination of features enables the models to correct VMF1-FC Tm with a relatively simple feature set.
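The five-element input vector can be assembled as sketched below. The article does not define fdoy explicitly; the sketch assumes it is the day of year plus the elapsed fraction of the day, so that the 12:00 UTC epoch on 1 January has fdoy = 1.5. The function names and the example coordinates are illustrative.

```python
from datetime import datetime


def floating_doy(when):
    """Day of year plus fraction of day (assumed definition of fdoy)."""
    seconds = when.hour * 3600 + when.minute * 60 + when.second
    return when.timetuple().tm_yday + seconds / 86400.0


def feature_row(lon, lat, h, when, vmf1fc_tm):
    """Input features in the order lon, lat, h, fdoy, VMF1-FC Tm."""
    return [lon, lat, h, floating_doy(when), vmf1fc_tm]


# Example row for a hypothetical site and the 12:00 UTC epoch on 1 January 2022.
row = feature_row(114.36, 30.53, 40.0, datetime(2022, 1, 1, 12), 280.0)
```

The corresponding target value for each row is the RS Tm at the same site and epoch.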

A dataset including the VMF1-FC Tm at 319 RS sites is collected, together with corresponding RS Tm and the associated lon, lat, h, and fdoy during 2019–2022. Observations from 2019 to 2021 are randomly divided into training and validation datasets, with 80% used for training and 20% used for validation. Data from 2022 are reserved as an independent test dataset to evaluate the generalization performance of the refined models.
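The partition described above can be sketched as follows. Each sample is assumed to be a dictionary carrying a "year" key, and the random seed is an arbitrary choice made for reproducibility; neither detail is taken from the article.

```python
import random


def split_samples(samples, train_frac=0.8, seed=42):
    """Random 80/20 split of 2019-2021 samples; 2022 is held out as the test set."""
    dev = [s for s in samples if 2019 <= s["year"] <= 2021]
    test = [s for s in samples if s["year"] == 2022]
    rng = random.Random(seed)   # fixed seed (assumption) so the split is repeatable
    rng.shuffle(dev)
    cut = int(round(train_frac * len(dev)))
    return dev[:cut], dev[cut:], test
```

Keeping an entire year out of model development means that the test set probes performance on a period the models have never seen, rather than on randomly withheld epochs from the training years.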

It should be noted that interpolating grid-based Tm to non-grid locations inevitably introduces errors, mainly arising from horizontal interpolation and height correction. In the proposed approach, the refined models are trained using interpolated Tm values together with explicit positional information. By incorporating geographic coordinates and ellipsoidal height as input features, the models could learn and partially compensate for systematic errors associated with horizontal interpolation and height correction. In practical applications, once VMF1-FC-forecasted Tm products are available, the calibrated Tm can be directly obtained by inputting the corresponding VMF1-FC Tm together with location and time information into the refined models, without requiring additional interpolation. As a result, the refined models can provide calibrated Tm estimates at arbitrary locations within the study region. This strategy improves computational efficiency while effectively reducing interpolation-related errors.

4.3. Hyperparameter Determination

To construct optimal refined models, hyperparameter tuning is performed for the XGBoost, LightGBM, and CatBoost algorithms. Hyperparameters are defined as configuration parameters that control the learning process and generalization behavior of machine learning models [52]. Appropriate hyperparameter selection is critical for balancing model complexity and prediction accuracy, and for avoiding overfitting.

In this study, hyperparameter optimization is conducted using the GridSearch Cross-Validation (GridSearchCV) framework implemented in Scikit-learn (sklearn.model_selection.GridSearchCV), which systematically evaluates predefined combinations of hyperparameters through cross-validation. This approach enables an objective and reproducible selection of optimal hyperparameter sets by minimizing validation errors across multiple folds. The theoretical background of GridSearchCV can be found in [47,53]. The hyperparameter tuning is conducted using a stepwise strategy. Initially, key hyperparameters are assigned reasonable initial values together with predefined candidate values arranged in ascending order. Subsequently, the hyperparameters are tuned in a predefined order by specifying their search ranges and determining the optimal values, while the remaining parameters are temporarily fixed. For example, the number of estimators (n_estimators) is first tuned using predefined candidate values ranging from 100 to 500 with a step size of 100. After the optimal value is identified, it replaces the initial setting and is subsequently fixed during the tuning of the remaining hyperparameters. It should be noted that if the optimal value of a hyperparameter is located at the boundary of the predefined candidate values (i.e., the minimum or maximum value), the candidate set is further expanded and the tuning process is repeated to ensure that a true optimum is obtained. The tuning order and predefined candidate values of the hyperparameters for the three ensemble algorithms are summarized in Table 2. The overall workflow of the refined model construction is illustrated in Figure 5.
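The stepwise strategy, including the expansion of the candidate set when the optimum falls on its boundary, can be sketched in a library-independent way. In the sketch, `score` is a hypothetical callable that returns the validation error for a given hyperparameter dictionary; in the workflow described above this role is played by the cross-validated error from GridSearchCV. Candidates are assumed to be evenly spaced, and the toy error surface is invented for illustration.

```python
def tune_stepwise(score, initial, grids, max_expansions=5):
    """Tune hyperparameters one at a time, keeping the others fixed.

    score   : callable(params dict) -> error to minimise (hypothetical stand-in
              for a cross-validated error)
    initial : dict of starting values for all hyperparameters
    grids   : dict name -> ascending, evenly spaced candidates, in tuning order
    """
    best = dict(initial)
    for name, candidates in grids.items():
        cands = list(candidates)
        step = cands[1] - cands[0]
        choice = best[name]
        for _ in range(max_expansions + 1):
            choice = min(cands, key=lambda c: score({**best, name: c}))
            if choice == cands[-1]:
                cands.append(cands[-1] + step)        # optimum on upper boundary
            elif choice == cands[0] and cands[0] - step > 0:
                cands.insert(0, cands[0] - step)      # optimum on lower boundary
            else:
                break
        best[name] = choice                           # fix before tuning the next one
    return best


def toy_score(params):
    """Invented error surface with its minimum at n_estimators=700, max_depth=6."""
    return (params["n_estimators"] - 700) ** 2 + (params["max_depth"] - 6) ** 2


tuned = tune_stepwise(
    toy_score,
    {"n_estimators": 100, "max_depth": 4},
    {"n_estimators": [100, 200, 300, 400, 500], "max_depth": [2, 4, 6, 8]},
)
```

In this toy case the initial range of 100 to 500 for n_estimators is extended three times before an interior optimum is found. A one-at-a-time search is much cheaper than a full grid over all combinations, at the cost of possibly missing interactions between hyperparameters.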

5. Performance Assessment of Refined Models

The performance of the three refined models is assessed on the test dataset from multiple perspectives, including overall global accuracy and its variation with latitude, ellipsoidal height, and time scale (seasonal and daily).

5.1. Global Accuracy

The overall global accuracy based on the test dataset is summarized in Table 3 for XTm, LTm, CTm, GPT3, and VMF1-FC. The validation accuracy is also presented in Table 3 for comparison. On the validation dataset, the biases of XTm, LTm, and CTm are all 0.00 K, while those of GPT3 and VMF1-FC are −0.60 K and 0.75 K, respectively. The corresponding RMSEs of XTm, LTm, and CTm are 1.37 K, 1.39 K, and 1.36 K, respectively, which are lower than those of VMF1-FC (1.76 K) and GPT3 (4.24 K). On the test dataset, the biases of XTm, LTm, and CTm are 0.00 K, 0.00 K, and −0.03 K, respectively, whereas those of GPT3 and VMF1-FC are −0.56 K and 0.74 K. The test results are consistent with the validation results, indicating no obvious overfitting. The corresponding reduction rates based on the test results relative to VMF1-FC and GPT3 are also summarized in Table 3. The refined models effectively correct the bias, with reduction rates of 99.93%/99.91%, 99.75%/99.67%, and 95.49%/93.98% with respect to VMF1-FC/GPT3 for XTm, LTm, and CTm, respectively. The RMSEs of XTm, LTm, and CTm are 1.45 K, 1.45 K, and 1.46 K, respectively. Compared with VMF1-FC (1.78 K) and GPT3 (4.13 K), the RMSEs of XTm, LTm, and CTm are reduced by 18.50%/64.93%, 18.44%/64.91%, and 18.11%/64.76%, respectively. Overall, XTm, LTm, and CTm exhibit near-zero biases and lower RMSEs, indicating a clear improvement in Tm estimation accuracy.
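The reduction rates can be approximately reproduced from the rounded values quoted above. The formula is not stated explicitly in the text; the sketch assumes the relative decrease in the absolute value of the metric with respect to the reference model. Small differences from the reported percentages are expected because the inputs here are rounded to two decimals.

```python
def reduction_rate(reference, refined):
    """Relative decrease (%) of |metric| with respect to the reference model (assumed formula)."""
    return (abs(reference) - abs(refined)) / abs(reference) * 100.0


# XTm test RMSE (1.45 K) against VMF1-FC (1.78 K) and GPT3 (4.13 K).
vs_vmf1fc = reduction_rate(1.78, 1.45)
vs_gpt3 = reduction_rate(4.13, 1.45)
```

With these rounded inputs the rates come out at roughly 18.5% and 64.9%, in line with the 18.50% and 64.93% listed for XTm.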

To further assess the spatial stability of the refined models, the spatial distributions of Tm accuracy derived from GPT3, VMF1-FC, XTm, LTm, and CTm are illustrated in Figure 6 and Figure 7. GPT3 and VMF1-FC exhibit obvious spatial heterogeneity in both bias and RMSE. GPT3 shows predominantly positive biases and larger RMSEs over continental interiors, and negative biases and smaller RMSEs along coastal regions. VMF1-FC is characterized mainly by positive biases, with both bias and RMSE reaching relatively large values over northeastern Eurasia and northeastern North America. In contrast, the distributions of bias and RMSE for XTm, LTm, and CTm are more spatially homogeneous at the global scale, with markedly reduced regional variability.

Overall, XTm, LTm, and CTm achieve lower bias and RMSE at the global scale and exhibit more spatially homogeneous accuracy distributions across stations relative to VMF1-FC and GPT3.

5.2. Accuracy of Models in Different Latitude Belts

The variation in Tm accuracy with latitude has been widely reported [54,55,56]. To illustrate the latitude-dependent performance of XTm, LTm, and CTm, the globe is divided into six latitude bands with an interval of 30°. For each latitude band, the mean bias and RMSE of Tm derived from GPT3, VMF1-FC, and the refined models are calculated from the sites located within that band, as shown in Figure 8. In terms of bias, GPT3 exhibits a relatively large bias in the mid- and high-latitude regions of the Southern Hemisphere and a negative bias in five of the six latitude bands. The maximum absolute bias reaches 2.21 K, indicating an underestimation of Tm. In contrast, VMF1-FC presents a relatively large bias in the Northern Hemisphere and a positive bias in five latitude bands, with a maximum value of 1.07 K, suggesting a clear overestimation of Tm. The absolute biases of XTm, LTm, and CTm in all six latitude bands are much closer to 0 K. This demonstrates that the refined models effectively remove latitude-dependent bias, yielding Tm estimates that are closer to the reference values. For RMSE, GPT3 yields the largest values in all six latitude bands, with RMSE increasing from low to high latitudes. The RMSE of VMF1-FC varies less with latitude but is also larger in high-latitude regions. Compared with VMF1-FC, the refined models show improved RMSE in all latitude bands, with the most notable improvements at mid- and high latitudes, resulting in a more uniform RMSE across bands. This may be attributed to the effective correction of bias.

In summary, the refined models effectively reduce the latitude-dependent variability in Tm accuracy, indicating that the proposed refined approach is capable of providing more accurate Tm estimates on a global scale.
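The band-wise statistics used in this subsection can be sketched as a simple grouping step. The sketch assumes bands of [−90°, −60°), [−60°, −30°), and so on, with 90° assigned to the northernmost band, and it pools all samples within a band; the exact edge handling and averaging used for Figure 8 are not stated in the text.

```python
import math


def latitude_band(lat):
    """Index 0..5 of the 30-degree band containing lat (0 = southernmost)."""
    return min(int((lat + 90.0) // 30.0), 5)


def stats_by_band(lats, est, ref):
    """Return {band index: (bias, rmse)} from pooled samples in each band."""
    groups = {}
    for lat, e, r in zip(lats, est, ref):
        groups.setdefault(latitude_band(lat), []).append(e - r)
    return {
        band: (sum(d) / len(d), math.sqrt(sum(x * x for x in d) / len(d)))
        for band, d in groups.items()
    }
```

Applied to each model in turn, the function yields one bias and one RMSE per band, which is the kind of summary compared in the discussion above.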

5.3. Accuracy of Models in Different Ellipsoidal Height Ranges

The Tm accuracy is also strongly influenced by height [57]. Therefore, to assess the height-dependent performance of refined models, the ellipsoidal height range of the 319 sites (−50 to 2500 m) is divided into six height intervals: −50–20 m, 20–50 m, 50–100 m, 100–200 m, 200–500 m, and >500 m. Figure 9 shows the bias and RMSE of Tm derived from different models within each height interval. The bias of Tm derived from VMF1-FC and GPT3 exhibits complementary patterns with height. In the 100–200 m height interval, GPT3 shows relatively small bias, whereas VMF1-FC exhibits larger bias. Conversely, in the 20–50 m and 200–500 m height intervals, a smaller bias is observed for VMF1-FC, while a larger bias is present in GPT3. In contrast, the biases of XTm, LTm, and CTm across all height intervals are consistently closer to 0 K than those of VMF1-FC and GPT3, indicating that the refined models effectively reduce height-dependent bias.

Meanwhile, VMF1-FC, XTm, LTm, and CTm exhibit clear accuracy advantages in RMSE compared with GPT3. Relative to VMF1-FC, the RMSEs of XTm, LTm, and CTm show further improvement in all height ranges. To further quantify the RMSE improvement achieved by XTm, LTm, and CTm relative to VMF1-FC, Table 4 summarizes the RMSE values across the six ellipsoidal height intervals. The RMSEs of XTm, LTm, and CTm are improved by at least 5% in all height intervals compared with VMF1-FC. The most pronounced improvements occur at low and mid-height ranges. In particular, the RMSE in the 100–200 m height range decreases from 2.24 K in VMF1-FC to approximately 1.60–1.61 K for the refined models, corresponding to a reduction of nearly 28–29%. At higher heights (>200 m), although the absolute RMSE values are generally lower, XTm, LTm, and CTm still achieve stable reductions of approximately 5–6%.

Overall, XTm, LTm, and CTm demonstrate superior accuracy and stability across different height intervals compared with GPT3 and VMF1-FC. The height-dependent variability of Tm accuracy is effectively mitigated, highlighting the strong performance and robustness of the proposed refined models.
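The assignment of sites to the six height intervals can be sketched with a sorted list of interval edges. Whether a site lying exactly on an edge belongs to the lower or the upper interval is not stated in the text; the sketch assumes upper-inclusive intervals, so that 500 m falls in the 200 to 500 m interval and ">500 m" is strict.

```python
import math
from bisect import bisect_left

# Upper edges of the first five intervals (m); the sixth interval is open-ended.
EDGES = [20.0, 50.0, 100.0, 200.0, 500.0]
LABELS = ["-50 to 20 m", "20 to 50 m", "50 to 100 m",
          "100 to 200 m", "200 to 500 m", ">500 m"]


def height_interval(h):
    """Label of the ellipsoidal height interval containing h (upper-inclusive, assumed)."""
    return LABELS[bisect_left(EDGES, h)]


def rmse_by_interval(heights, est, ref):
    """Return {interval label: RMSE} from pooled samples in each interval."""
    groups = {}
    for h, e, r in zip(heights, est, ref):
        groups.setdefault(height_interval(h), []).append((e - r) ** 2)
    return {label: math.sqrt(sum(sq) / len(sq)) for label, sq in groups.items()}
```

The same grouping applied to the signed differences instead of their squares gives the interval-wise bias.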

5.4. Accuracy of Models at Different Time Scales

Tm varies with season [58,59]. To assess the seasonal characteristics of model performance, the bias and RMSE of Tm derived from the five models are calculated for different seasons. The results are presented as error-bar plots in Figure 10. In each panel, the central circle represents the seasonal mean value, while the length of the error bar reflects the dispersion of model errors across sites within that season. For bias, GPT3 and VMF1-FC show noticeable seasonal dependence, with larger deviations and longer error bars, particularly in autumn and winter. In contrast, XTm, LTm, and CTm exhibit biases consistently centered near zero in all seasons, together with markedly reduced dispersion. A similar seasonal pattern is observed for RMSE. GPT3 exhibits larger errors and stronger contrasts between seasons. Compared with GPT3, VMF1-FC shows a smaller RMSE and weaker seasonal variability. XTm, LTm, and CTm further reduce RMSE and maintain consistently low values with limited seasonal variability. These results indicate that although Tm is inherently influenced by seasonal thermal variability, the refined models provide the most consistent and seasonally stable Tm estimates in both bias and RMSE.

Seasonal analysis reflects long-term (low-frequency) variability. To further characterize short-term (high-frequency) variations, a day-by-day analysis is conducted; the two analyses are complementary for evaluating model performance across different temporal scales. The daily mean bias and RMSE time series are shown in Figure 11. As shown in the upper panel of Figure 11, the bias of GPT3 is more dispersed and exhibits larger fluctuations throughout the year than those of VMF1-FC, XTm, LTm, and CTm, indicating unstable temporal behavior. The bias of VMF1-FC remains positive throughout the year and shows relatively stable variation. The daily bias variations of XTm, LTm, and CTm are similar to those of VMF1-FC but are consistently closer to 0 K over the entire year, demonstrating better agreement with RS Tm. The bottom panel of Figure 11 shows that GPT3 has the largest daily mean RMSE and the strongest temporal fluctuations, with clear seasonal variation. In contrast, the daily mean RMSEs of VMF1-FC and the three refined models remain relatively stable throughout the year. Although the annual RMSE levels of VMF1-FC and the refined models are comparable, the refined models yield smaller daily RMSE values for most epochs, indicating superior performance relative to VMF1-FC.

Overall, XTm, LTm, and CTm are less affected by seasonal variation than GPT3 and VMF1-FC and exhibit higher accuracy and reliability. Moreover, the Tm derived from XTm, LTm, and CTm agrees well with RS Tm at the daily scale, demonstrating enhanced temporal performance and improved stability.

6. Conclusions

In this study, the accuracy of the grid-wise VMF1-FC-forecasted Tm is first assessed against RS Tm from 319 global RS sites during 2019–2021, with the GPT3 model used for comparison. The assessment results indicate that although VMF1-FC Tm exhibits a clear advantage in RMSE, it also displays a relatively large bias. In addition, both the RMSE and bias of VMF1-FC Tm show noticeable spatial heterogeneity and seasonal variability. To improve the VMF1-FC-forecasted Tm, three refined VMF1-FC Tm models (XTm, LTm, and CTm) are developed based on XGBoost, LightGBM, and CatBoost, respectively. Validation using independent RS observations in 2022 demonstrates that XTm, LTm, and CTm exhibit better consistency with RS Tm, with biases of 0 K, 0 K, and −0.03 K, respectively. Benefiting from the bias reduction, the RMSEs of the refined models are also improved, with improvement rates of 18.50%/64.93%, 18.44%/64.91%, and 18.11%/64.76% relative to VMF1-FC/GPT3. Furthermore, XTm, LTm, and CTm consistently exhibit superior performance across different latitude bands, ellipsoidal height ranges, and temporal scales relative to VMF1-FC and GPT3. Therefore, the proposed refined models can effectively improve the accuracy of Tm used in GNSS meteorology, leading to more reliable conversion from GNSS ZWD to PWV. This enhancement is particularly beneficial for real-time GNSS-based PWV sensing and weather forecasting applications. Future work will focus on improving model generalization under complex atmospheric conditions, such as extreme rainfall.
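To illustrate how a Tm error propagates into PWV, the sketch below applies the standard ZWD-to-PWV conversion, PWV = Π · ZWD with Π = 10^6 / [ρ_w R_v (k3/Tm + k2′)]. It is an illustrative calculation, not part of the study: the refractivity constants are the values commonly quoted from Bevis et al. (1994), the function names are hypothetical, and the inputs ZWD = 200 mm and Tm = 275 K are assumed purely for demonstration.

```python
# Physical constants (SI units). The refractivity constants are the values
# commonly quoted from Bevis et al. (1994); treating them as exact is an
# assumption of this sketch.
RHO_W = 1000.0    # density of liquid water, kg m^-3
R_V = 461.5       # specific gas constant of water vapor, J kg^-1 K^-1
K2_PRIME = 0.221  # K Pa^-1   (22.1 K hPa^-1)
K3 = 3739.0       # K^2 Pa^-1 (3.739e5 K^2 hPa^-1)


def conversion_factor(tm_kelvin):
    """Dimensionless factor mapping ZWD onto PWV for a given Tm."""
    return 1.0e6 / (RHO_W * R_V * (K3 / tm_kelvin + K2_PRIME))


def zwd_to_pwv(zwd_mm, tm_kelvin):
    """Convert zenith wet delay (mm) to precipitable water vapor (mm)."""
    return conversion_factor(tm_kelvin) * zwd_mm


def pwv_error_from_tm_error(zwd_mm, tm_kelvin, tm_error_kelvin):
    """PWV change (mm) caused by shifting Tm by `tm_error_kelvin`."""
    return zwd_to_pwv(zwd_mm, tm_kelvin + tm_error_kelvin) - zwd_to_pwv(zwd_mm, tm_kelvin)


if __name__ == "__main__":
    zwd_mm, tm_kelvin = 200.0, 275.0  # illustrative inputs, not from the study
    for label, tm_rmse in [("VMF1-FC", 1.78), ("XTm", 1.45)]:
        err = pwv_error_from_tm_error(zwd_mm, tm_kelvin, tm_rmse)
        print(f"{label}: Tm error {tm_rmse} K -> PWV error {err:.3f} mm")
```

Because Π is nearly proportional to Tm, the relative PWV error is close to the relative Tm error, so the smaller test RMSE of the refined models (1.45 K versus 1.78 K for VMF1-FC) translates into a proportionally smaller PWV error for a given ZWD.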

Author Contributions

Conceptualization, L.C. and J.S.; methodology, L.C. and J.S.; software, L.C. and F.L.; validation, L.C. and J.S.; formal analysis, L.C. and J.S.; investigation, F.L.; resources, L.C., J.S., F.L. and B.Z.; data curation, L.C., F.L. and B.Z.; writing—original draft preparation, L.C. and J.S.; writing—review and editing, L.C., J.S., F.L. and B.Z.; visualization, L.C.; supervision, F.L. and B.Z.; project administration, J.S.; funding acquisition, J.S. and B.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Guangxi Natural Science Foundation of China (GuikeFN2600640229), the National Natural Science Foundation of China (12373083, 12403083), and the Natural Science Foundation of Hubei Province (2024AFA003).

Data Availability Statement

The grid VMF1-FC product and the GPT3 model are provided by the VMF Data Server at https://vmf.geo.tuwien.ac.at/ (accessed on 28 March 2026). Radiosonde data can be downloaded from the Integrated Global Radiosonde Archive (IGRA) at https://www.ncei.noaa.gov/pub/data/igra/ (accessed on 28 March 2026).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Yang, F.; Liu, M.; Zhao, Y.; An, X.; Wang, L.; Wen, Z. Higher Accuracy Estimation of the Weighted Mean Temperature (Tm) Using GPT3 Model with New Grid Coefficients over China. Atmos. Res. 2024, 305, 107424.
2. Weckwerth, T.M.; Parsons, D.B.; Koch, S.E.; Moore, J.A.; LeMone, M.A.; Demoz, B.B.; Flamant, C.; Geerts, B.; Wang, J.; Feltz, W.F. An Overview of the International H2O Project (IHOP_2002) and Some Preliminary Highlights. Bull. Am. Meteorol. Soc. 2004, 85, 253–278.
3. Zhao, Q.; Liu, Y.; Ma, X.; Yao, W.; Yao, Y.; Li, X. An Improved Rainfall Forecasting Model Based on GNSS Observations. IEEE Trans. Geosci. Remote Sens. 2020, 58, 4891–4900.
4. Zhao, Q.; Yang, P.; Yao, W.; Yao, Y. Adaptive AOD Forecast Model Based on GNSS-Derived PWV and Meteorological Parameters. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5800610.
5. Xu, J.; Liu, Z. An Observed Relationship Between Satellite-Estimated Transmittance and Ground-Estimated Water Vapor: Implications for High-Temporal-Resolution Water Vapor Retrieval from Non-Geostationary Satellite Measurements. Geophys. Res. Lett. 2024, 51, e2024GL111033.
6. Li, H.; Zhao, Q.; Guo, H.; Li, Z.; Ma, Y.; Yao, Y.; Yin, J.; Zhai, Y.; Liang, H.; Xiong, Z. A GNSS PWV Filling and Short-Term Forecasting Framework Fused Hybrid Neural Network. Atmos. Res. 2026, 329, 108508.
7. Huang, L.; Wang, X.; Xiong, S.; Li, J.; Liu, L.; Mo, Z.; Fu, B.; He, H. High-Precision GNSS PWV Retrieval Using Dense GNSS Sites and in-Situ Meteorological Observations for the Evaluation of MERRA-2 and ERA5 Reanalysis Products over China. Atmos. Res. 2022, 276, 106247.
8. Bevis, M.; Businger, S.; Herring, T.A.; Rocken, C.; Anthes, R.A.; Ware, R.H. GPS Meteorology: Remote Sensing of Atmospheric Water Vapor Using the Global Positioning System. J. Geophys. Res. 1992, 97, 15787–15801.
9. Rocken, C.; Hove, T.V.; Johnson, J.; Solheim, F.; Ware, R.; Bevis, M.; Chiswell, S.; Businger, S. GPS/STORM—GPS Sensing of Atmospheric Water Vapor for Meteorology. J. Atmos. Ocean. Technol. 1995, 12, 468–478.
10. Du, Z.; Zhang, B.; Yao, Y.; Zhao, Q.; Zhang, L. Integrating Near-Infrared, Thermal Infrared, and Microwave Satellite Observations to Retrieve High-Resolution Precipitable Water Vapor. Remote Sens. Environ. 2025, 318, 114611.
11. Wang, J.; Liu, Z. Improving GNSS PPP Accuracy through WVR PWV Augmentation. J. Geod. 2019, 93, 1685–1705.
12. Xie, S.; Zhang, J.; Huang, L.; Chen, F.; Wu, Y.; Wang, Y.; Liu, L. A Hybrid-Grid Global Model for the Estimation of Atmospheric Weighted Mean Temperature Considering Time-Varying Vertical Adjustment Rate in GNSS Precipitable Water Vapour Retrieval. Geosci. Model Dev. 2025, 18, 6987–7002.
13. Zhang, B.; Wu, T.; Shen, Y. Atmospheric Weighted Average Temperature Enhancement Model for the European Region Considering Daily Variations and Residual Changes in Surface Temperature. Remote Sens. 2025, 18, 36.
14. Zhu, M.; Yu, X.; Sun, W. A Coalescent Grid Model of Weighted Mean Temperature for China Region Based on Feedforward Neural Network Algorithm. GPS Solut. 2022, 26, 70.
15. Wang, X.; Zhang, K.; Wu, S.; Fan, S.; Cheng, Y. Water Vapor-weighted Mean Temperature and Its Impact on the Determination of Precipitable Water Vapor and Its Linear Trend. JGR Atmos. 2016, 121, 833–852.
16. Ross, R.J.; Rosenfeld, S. Estimating Mean Weighted Temperature of the Atmosphere for Global Positioning System Applications. J. Geophys. Res. 1997, 102, 21719–21730.
17. Liou, Y.-A.; Teng, Y.-T.; Van Hove, T.; Liljegren, J.C. Comparison of Precipitable Water Observations in the Near Tropics by GPS, Microwave Radiometer, and Radiosondes. J. Appl. Meteor. 2001, 40, 5–15.
18. Yao, Y.; Zhang, B.; Xu, C.; Yan, F. Improved One/Multi-Parameter Models That Consider Seasonal and Geographic Variations for Estimating Weighted Mean Temperature in Ground-Based GPS Meteorology. J. Geod. 2014, 88, 273–282.
19. Mekik, C.; Deniz, I. Modelling and Validation of the Weighted Mean Temperature for Turkey. Meteorol. Appl. 2017, 24, 92–100.
20. Liu, J.; Yao, Y.; Sang, J. A New Weighted Mean Temperature Model in China. Adv. Space Res. 2018, 61, 402–412.
21. Wang, S.; Xu, T.; Nie, W.; Wang, J.; Xu, G. Establishment of Atmospheric Weighted Mean Temperature Model in the Polar Regions. Adv. Space Res. 2020, 65, 518–528.
22. Yang, F.; Guo, J.; Meng, X.; Li, J.; Li, Z.; Tang, W. GGTm-Ts: A Global Grid Model of Weighted Mean Temperature (Tm) Based on Surface Temperature (Ts) with Two Modes. Adv. Space Res. 2023, 71, 1510–1524.
23. Yao, Y.; Zhu, S.; Yue, S. A Globally Applicable, Season-Specific Model for Estimating the Weighted Mean Temperature of the Atmosphere. J. Geod. 2012, 86, 1125–1135.
24. Yao, Y.B.; Zhang, B.; Yue, S.Q.; Xu, C.Q.; Peng, W.F. Global Empirical Model for Mapping Zenith Wet Delays onto Precipitable Water. J. Geod. 2013, 87, 439–448.
25. Yao, Y.; Xu, C.; Zhang, B.; Cao, N. GTm-III: A New Global Empirical Model for Mapping Zenith Wet Delays onto Precipitable Water Vapour. Geophys. J. Int. 2014, 197, 202–212.
26. Boehm, J.; Heinkelmann, R.; Schuh, H. Short Note: A Global Model of Pressure and Temperature for Geodetic Applications. J. Geod. 2007, 81, 679–683.
27. Böhm, J.; Lagler, K.; Schindelegger, M.; Krásná, H.; Weber, R.; Möller, G. GPT2: An Improved Model for Tropospheric Slant Delays in VLBI and GNSS Analysis. In Proceedings of the European Navigation Conference Proceedings, Vienna, Austria, 23–25 April 2013; p. 4.
28. Böhm, J.; Möller, G.; Schindelegger, M.; Pain, G.; Weber, R. Development of an Improved Empirical Model for Slant Delays in the Troposphere (GPT2w). GPS Solut. 2015, 19, 433–441.
29. Landskron, D.; Böhm, J. VMF3/GPT3: Refined Discrete and Empirical Troposphere Mapping Functions. J. Geod. 2018, 92, 349–360.
30. Sun, Z.; Zhang, B.; Yao, Y. A Global Model for Estimating Tropospheric Delay and Weighted Mean Temperature Developed with Atmospheric Reanalysis Data from 1979 to 2017. Remote Sens. 2019, 11, 1893.
31. Yang, F.; Guo, J.; Meng, X.; Shi, J.; Zhang, D.; Zhao, Y. An Improved Weighted Mean Temperature (Tm) Model Based on GPT2w with Tm Lapse Rate. GPS Solut. 2020, 24, 46.
32. Sun, P.; Wu, S.; Zhang, K.; Wan, M.; Wang, R. A New Global Grid-Based Weighted Mean Temperature Model Considering Vertical Nonlinear Variation. Atmos. Meas. Tech. 2021, 14, 2529–2542.
33. Zhu, Y.; Sha, Z.; Wei, P.; Ye, S.; Xia, P.; Hu, F. GFZTD: A Multimodal Fusion-Driven 3-D Tropospheric Delay Prediction Model Coupling Self-Attention and ConvLSTM. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2026, 19, 6375–6388.
34. Zhang, B.; Zhu, J.; Shen, Y. Multi-Scale Accuracy Evaluation and Adaptability Analysis of Different Atmospheric Weighted Mean Temperature Models in China. J. Appl. Geod. 2026, in press.
35. Bevis, M.; Businger, S.; Chiswell, S.; Herring, T.A.; Anthes, R.A.; Rocken, C.; Ware, R.H. GPS Meteorology: Mapping Zenith Wet Delays onto Precipitable Water. J. Appl. Meteor. 1994, 33, 379–386.
36. Chen, Q.; Song, S.; Heise, S.; Liou, Y.-A.; Zhu, W.; Zhao, J. Assessment of ZTD Derived from ECMWF/NCEP Data with GPS ZTD over China. GPS Solut. 2011, 15, 415–425.
37. Hadas, T.; Teferle, F.N.; Kazmierski, K.; Hordyniec, P.; Bosy, J. Optimum Stochastic Modeling for GNSS Tropospheric Delay Estimation in Real-Time. GPS Solut. 2017, 21, 1069–1081.
38. Yuan, Y.; Holden, L.; Kealy, A.; Choy, S.; Hordyniec, P. Assessment of Forecast Vienna Mapping Function 1 for Real-Time Tropospheric Delay Modeling in GNSS. J. Geod. 2019, 93, 1501–1514.
39. Zhang, H.; Yuan, Y.; Li, W. Real-Time Wide-Area Precise Tropospheric Corrections (WAPTCs) Jointly Using GNSS and NWP Forecasts for China. J. Geod. 2022, 96, 44.
40. Boehm, J.; Kouba, J.; Schuh, H. Forecast Vienna Mapping Functions 1 for Real-Time Analysis of Space Geodetic Observations. J. Geod. 2009, 83, 397–401.
41. Ding, J.; Mi, X.; Chen, W.; Chen, J.; Wang, J.; Zhang, Y.; Awange, J.L.; Soja, B.; Bai, L.; Deng, Y.; et al. Forecasting of Tropospheric Delay Using AI Foundation Models in Support of Microwave Remote Sensing. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5803019.
42. Wang, J.; Zhang, L.; Dai, A. Global Estimates of Water-vapor-weighted Mean Temperature of the Atmosphere for GPS Applications. J. Geophys. Res. 2005, 110, 2005JD006215.
43. Ding, M. A Neural Network Model for Predicting Weighted Mean Temperature. J. Geod. 2018, 92, 1187–1198.
44. Aragón Paz, J.M.; Mendoza, L.P.O.; Fernández, L.I. Near-Real-Time GNSS Tropospheric IWV Monitoring System for South America. GPS Solut. 2023, 27, 93.
45. Hu, M.; Li, J.; Yao, C.; Li, F.; Liu, L.; Huang, L.; Wang, Y. Higher Accuracy Estimation of the Weighted Mean Temperature (Tm) with the Aid of Machine Learning and NWP Model. All Earth 2025, 37, 1–14.
46. Wang, H.; Yang, F.; Zheng, J.; Wang, Z.; Chen, W.; Xie, J. From GNSS Zenith Tropospheric Delay to Precipitable Water Vapor: Accuracy Assessment Using In-Situ and Reanalysis Meteorological Data Over China. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 14582–14593.
47. Li, F.; Li, J.; Liu, L.; Huang, L.; Zhou, L.; He, H. Machine Learning-Based Calibrated Model for Forecast Vienna Mapping Function 3 Zenith Wet Delay. Remote Sens. 2023, 15, 4824.
48. Kouba, J. Implementation and Testing of the Gridded Vienna Mapping Function 1 (VMF1). J. Geod. 2008, 82, 193–205.
49. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 785–794.
50. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Proceedings of the Advances in Neural Information Processing Systems 30, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017.
51. Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased Boosting with Categorical Features. In Proceedings of the Advances in Neural Information Processing Systems 30, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017.
52. Yao, X.; Fu, X.; Zong, C. Short-Term Load Forecasting Method Based on Feature Preference Strategy and LightGBM-XGboost. IEEE Access 2022, 10, 75257–75268.
53. Liang, C.-W.; Chang, C.-C.; Hsiao, C.-Y.; Liang, C.-J. Prediction and Analysis of Atmospheric Visibility in Five Terrain Types with Artificial Intelligence. Heliyon 2023, 9, e19281.
54. Li, J.; Li, F.; Liu, L.; Yao, Y.; Huang, L.; Wang, Y. A Weighted Mean Temperature Forecast Model Based on Fused Data and Generalized Regression Neural Network and Its Impact on GNSS-Based Precipitable Water Vapor Estimation. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4101914.
55. Yang, F.; Wang, L.; Li, Z.; Tang, W.; Meng, X. A Weighted Mean Temperature (Tm) Augmentation Method Based on Global Latitude Zone. GPS Solut. 2022, 26, 141.
56. Ma, Y.; Zhao, Q.; Wu, K.; Yao, W.; Liu, Y.; Li, Z.; Shi, Y. Comprehensive Analysis and Validation of the Atmospheric Weighted Mean Temperature Models in China. Remote Sens. 2022, 14, 3435.
57. Sun, Z.; Zhang, B.; Yao, Y. An ERA5-Based Model for Estimating Tropospheric Delay and Weighted Mean Temperature Over China with Improved Spatiotemporal Resolutions. Earth Space Sci. 2019, 6, 1926–1941.
58. Maghrabi, A.H.; Alothman, A.A.; Almutairi, M.M.; Aldosari, A.F.; Aldakhil, A.A.; Allehyani, B.I.; Aljarbar, G.A.; Altilasi, M.I. Variations and Modeling of the Atmospheric Weighted Mean Temperature for Ground-Based GNSS Applications: Central Arabian Peninsula. Adv. Space Res. 2018, 62, 2431–2442.
59. Sun, Y.; Yang, F.; Liu, M.; Li, Z.; Gong, X.; Wang, Y. Evaluation of the Weighted Mean Temperature over China Using Multiple Reanalysis Data and Radiosonde. Atmos. Res. 2023, 285, 106664.

Figure 1. Global distribution of 319 radiosonde sites.

Figure 2. Spatial distributions of bias and RMSE of GPT3 and VMF1-FC Tm. (a) GPT3 bias; (b) VMF1-FC bias; (c) GPT3 RMSE; (d) VMF1-FC RMSE.

Figure 3. Monthly mean bias and RMSE of GPT3 and VMF1-FC Tm.

Figure 4. Structure of the boosting algorithm. The ellipsis (“…”) indicates repeated iterations of the training process.

Figure 5. Workflow of the development of refined models.

Figure 6. Station-based distributions of bias in Tm derived from GPT3, VMF1-FC, XTm, LTm, and CTm. (a) GPT3; (b) VMF1-FC; (c) XTm; (d) LTm; (e) CTm.

Figure 7. Station-based distributions of RMSE in Tm derived from GPT3, VMF1-FC, XTm, LTm, and CTm. (a) GPT3; (b) VMF1-FC; (c) XTm; (d) LTm; (e) CTm.

Figure 8. Bias and RMSE of Tm across different latitude bands for GPT3, VMF1-FC, XTm, LTm, and CTm.

Figure 9. Bias and RMSE of Tm across different ellipsoidal height ranges for GPT3, VMF1-FC, XTm, LTm, and CTm.

Figure 10. Seasonal variations in bias (a) and RMSE (b) in Tm for GPT3, VMF1-FC, XTm, LTm, and CTm.

Figure 11. Daily mean bias (a) and RMSE (b) of Tm for GPT3, VMF1-FC, XTm, LTm, and CTm.

Table 1. Bias and RMSE of GPT3 and VMF1-FC Tm.

	Bias (K)	RMSE (K)
GPT3	−0.60	4.24
VMF1-FC	0.75	1.76

Table 2. Tuning order and predefined candidate values of hyperparameters for the three ensemble algorithms.

	Order	Hyperparameter	Initial Value	Candidate Values
XGBoost	1	n_estimators	500	[100, 200, 300, 400, 500]
	2	max_depth	5	[3, 4, 5, 6, 7, 8, 9, 10]
	2	min_child_weight	1	[1, 2, 3, 4, 5, 6]
	3	gamma	0	[0, 0.1, 0.2, 0.3, 0.4, 0.5]
	4	subsample	0.8	[0.6, 0.7, 0.8, 0.9]
	4	colsample_bytree	0.8	[0.6, 0.7, 0.8, 0.9]
	5	learning_rate	0.1	[0.01, 0.05, 0.07, 0.1, 0.2]
LightGBM	1	n_estimators	100	[100, 200, 300, 400, 500]
	2	max_depth	3	[3, 4, 5, 6, 7, 8, 9, 10]
	2	num_leaves	10	[10:10:150]
	3	min_child_samples	10	[10:1:16]
	3	min_child_weight	0.001	[0.001, 0.002]
	4	max_bin	512	[64, 128, 256, 512]
	5	feature_fraction	1	[0.6, 0.8, 1]
	6	learning_rate	0.1	[0.01, 0.05, 0.07, 0.1, 0.2]
CatBoost	1	depth	7	[3, 4, 5, 6, 7, 8, 9, 10]
	2	learning_rate	0.1	[0.01, 0.05, 0.07, 0.1, 0.2]
	3	l2_leaf_reg	1	[1:1:9]
	4	iterations	100	[100, 200, 300, 400, 500]
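The stage-wise tuning order in Table 2 can be read as a sequential grid search, in which each stage scans its candidate values while the winners of earlier stages stay fixed. The sketch below illustrates that idea for the XGBoost rows only. It is not the authors' code: `staged_grid_search` and `toy_score` are hypothetical names, treating hyperparameters that share an order number as a joint grid is an assumption, and the scoring function is a made-up placeholder standing in for a real validation RMSE.

```python
from itertools import product

# Initial values and candidate grids for XGBoost, copied from Table 2.
# Hyperparameters sharing an order number are assumed to be tuned jointly.
XGB_INITIAL = {
    "n_estimators": 500, "max_depth": 5, "min_child_weight": 1, "gamma": 0,
    "subsample": 0.8, "colsample_bytree": 0.8, "learning_rate": 0.1,
}
XGB_STAGES = [
    {"n_estimators": [100, 200, 300, 400, 500]},
    {"max_depth": [3, 4, 5, 6, 7, 8, 9, 10], "min_child_weight": [1, 2, 3, 4, 5, 6]},
    {"gamma": [0, 0.1, 0.2, 0.3, 0.4, 0.5]},
    {"subsample": [0.6, 0.7, 0.8, 0.9], "colsample_bytree": [0.6, 0.7, 0.8, 0.9]},
    {"learning_rate": [0.01, 0.05, 0.07, 0.1, 0.2]},
]


def staged_grid_search(initial, stages, score):
    """Tune hyperparameters stage by stage, keeping earlier winners fixed.

    initial: dict with a starting value for every hyperparameter.
    stages:  list of {name: candidates}; names in one dict form a joint grid.
    score:   callable(params) -> error to minimize (e.g. validation RMSE).
    """
    best = dict(initial)
    best_score = score(best)
    for grid in stages:
        names = list(grid)
        for combo in product(*(grid[name] for name in names)):
            trial = {**best, **dict(zip(names, combo))}
            trial_score = score(trial)
            if trial_score < best_score:  # keep the first strictly better setting
                best, best_score = trial, trial_score
    return best, best_score


def toy_score(params):
    """Placeholder objective (NOT a real model evaluation), used for demonstration.

    In practice this would train a regressor with `params` and return its
    RMSE on held-out validation data.
    """
    return (abs(params["max_depth"] - 7)
            + abs(params["n_estimators"] - 300) / 100
            + abs(params["learning_rate"] - 0.05))
```

Calling `staged_grid_search(XGB_INITIAL, XGB_STAGES, toy_score)` walks through the five stages in order. Replacing `toy_score` with a function that fits the model and evaluates it on validation data would turn the sketch into a working tuner, at the cost of one model fit per candidate combination.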

Table 3. Overall global accuracy of different models.

	Validation		Test		Reduction (%) (Based on Test)
	Bias (K)	RMSE (K)	Bias (K)	RMSE (K)	Bias vs. VMF1-FC	Bias vs. GPT3	RMSE vs. VMF1-FC	RMSE vs. GPT3
GPT3	−0.60	4.24	−0.56	4.13	/	/	/	/
VMF1-FC	0.75	1.76	0.74	1.78	/	/	/	/
XTm	0.00	1.37	0.00	1.45	99.93	99.91	18.50	64.93
LTm	0.00	1.39	0.00	1.45	99.75	99.67	18.44	64.91
CTm	0.00	1.36	−0.03	1.46	95.49	93.98	18.11	64.76

Note. The bias and RMSE values reported in this table are rounded for presentation, whereas the corresponding improvement rates are calculated from the original unrounded outputs. Therefore, slight differences may occur between the reported values and the percentages directly back-calculated from the rounded numbers.

Table 4. RMSE (K) of Tm for VMF1-FC, XTm, LTm, and CTm across different ellipsoidal height ranges.

Ellipsoidal Height Range (m)	−50~20	20~50	50~100	100~200	200~500	>500
VMF1-FC	1.51	1.39	1.72	2.24	1.54	1.35
XTm	1.18	1.26	1.47	1.60	1.44	1.28
LTm	1.18	1.26	1.46	1.60	1.46	1.28
CTm	1.19	1.25	1.48	1.61	1.45	1.28

