Fusion of BeiDou and MODIS Precipitable Water Vapor Using the Random Forest Algorithm: A Case Study of Multi-Source Data Synergy in Hunan Province, China

Sun, Minghan; Pang, Zhiguo; Lu, Jingxuan; Jiang, Wei; Qin, Xiangdong; Zhou, Zhuoyue

doi:10.3390/rs18010104


by Minghan Sun ^1,2, Zhiguo Pang ^1,2,*, Jingxuan Lu ^1,2, Wei Jiang ^1,2, Xiangdong Qin ^1,2 and Zhuoyue Zhou ^1,2

¹ State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin, China Institute of Water Resources and Hydropower Research, Beijing 100038, China

² Research Center of Flood and Drought Disaster Reduction of the Ministry of Water Resources, Beijing 100038, China

^* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(1), 104; https://doi.org/10.3390/rs18010104 (registering DOI)

Submission received: 24 November 2025 / Revised: 18 December 2025 / Accepted: 25 December 2025 / Published: 27 December 2025

(This article belongs to the Topic Advances in Earth Observation Technologies to Support Water-Related Sustainable Development Goals (SDGs))


Highlights

What are the main findings?

Developed a seasonal random forest fusion model based on dry-wet season residual differences, integrating BDS- and MODIS-PWV to generate high-accuracy, spatially continuous daily PWV fields.
The fusion PWV data exhibit high consistency with RS-PWV, achieving correlations above 0.95 and improving accuracy by more than 75% compared with the original MODIS data.

What are the implications of the main findings?

The fusion model overcomes MODIS underestimation and weather sensitivity, providing a reliable approach for accurate, spatially continuous PWV retrieval.
The fusion PWV dataset enhances regional water vapor analysis and supports improved precipitation and weather forecasting in complex terrains.

Abstract

Accurate monitoring of water vapor is essential for understanding the hydrological cycle and improving weather forecasting. Although the Moderate-resolution Imaging Spectroradiometer (MODIS) provides spatially continuous precipitable water vapor (PWV), validation in Hunan Province reveals a systematic underestimation, with correlations with radiosonde-derived PWV (RS-PWV) of around 0.40 and average RMSE and MAE reaching 23.80 and 18.04 mm, respectively. To address this issue, high-accuracy PWV derived from the BeiDou Navigation Satellite System (BDS-PWV), which shows high consistency with RS-PWV, was incorporated. A random forest daily-scale water vapor fusion model was developed based on the differential characteristics of dry and wet season residuals. By employing day of year (DOY), latitude, longitude, and elevation as auxiliary factors, the model establishes a seasonal fusion framework that dynamically transitions between dry and wet seasons. Validation shows that the fusion PWV aligns closely with RS-PWV, reducing average RMSE and MAE to 4.71 and 3.81 mm, corresponding to improvements of 80.21% and 78.88% over MODIS, with accuracy increases exceeding 75% at all stations. The fusion model effectively mitigates MODIS’s underestimation and weather sensitivity, producing high-accuracy, spatially continuous daily PWV fields and offering strong potential for improving precipitation and weather forecasting in complex regions such as Hunan Province.

Keywords:

precipitable water vapor (PWV); moderate-resolution imaging spectroradiometer (MODIS); BeiDou navigation satellite system (BDS); random forest; water vapor data fusion model

1. Introduction

Water vapor is one of the most dynamic constituents of the atmosphere, with approximately 99% concentrated in the troposphere. Variations in tropospheric water vapor are closely linked to cloud formation and dissipation, as well as precipitation processes such as rainfall and snowfall, playing a crucial role in their occurrence and evolution [1,2,3]. Monitoring the spatiotemporal variations in water vapor is essential for accurately predicting precipitation timing, location, and intensity [4,5,6]. Moreover, it contributes to improving numerical weather forecasting, enhancing our understanding of the Earth’s climate system, assessing the impacts of climate change [7], and supporting applications in hydrological disaster prevention, mitigation, and water resource management.

The spatiotemporal variations in water vapor are highly complex, making accurate monitoring and forecasting challenging. Precipitable water vapor (PWV) is commonly used to quantify atmospheric water vapor content, representing the total amount of precipitation that would result if all the water vapor in a vertical air column were condensed [8,9]. Traditional PWV detection methods include radiosondes (RSs) [10,11], microwave radiometers [12], remote sensing [13,14,15], and reanalysis datasets [16,17]. RSs carried by weather balloons provide direct vertical atmospheric profiling, offering high-accuracy PWV measurements that serve as reference data for validating other retrieval methods. However, radiosondes are single-use consumables with high operational costs, limited station distribution, and relatively low spatiotemporal resolution, restricting their applicability in large-scale water vapor monitoring. With advancements in remote sensing technology, the satellite-based retrieval of PWV has become widely utilized [18]. Remote sensing imagery, which is easily accessible, leverages spectral absorption differences across various sensor channels to derive continuous, large-scale distributions of PWV with high spatial resolution. More recently, the Global Navigation Satellite System (GNSS) has emerged as a promising data source for PWV retrieval, offering point-based observations with high temporal resolution and minimal susceptibility to weather conditions [19,20]. The BeiDou Navigation Satellite System (BDS), developed and independently operated by China, has gained significant attention for its reliability and performance in PWV retrieval. 
Ground-based BDS-PWV estimation using the Precise Point Positioning (PPP) method achieves retrieval accuracy comparable to that of the Global Positioning System (GPS), with the mean absolute error (MAE) and root mean square error (RMSE) remaining at the millimeter level, typically within the range of 1–3 mm [21,22], thereby meeting the accuracy requirements for PWV monitoring. In particular, BDS demonstrates high correlation, low bias, and superior performance in real-time processing and B2b signal applications [23,24].

The Moderate-resolution Imaging Spectroradiometer (MODIS), mounted on the Terra and Aqua satellites, provides stable data acquisition with a long-term observational record spanning from 2000 to the present. As one of the most widely used data sources in current research, MODIS is capable of retrieving column-integrated precipitable water vapor content from the near-surface to the upper troposphere. However, the reliability of MODIS water vapor products remains moderate, with significant retrieval biases, particularly under cloudy and rainy conditions, where underestimation is commonly observed [25]. The accuracy of MOD05 water vapor products exhibits notable regional variations, with significant deviations observed in certain areas. In China, northern regions demonstrate relatively better accuracy, whereas southern regions, especially the southeastern areas, suffer from larger biases and lower accuracy, with RMSE exceeding 10 mm [26]. This systematic underestimation is primarily attributed to cloud interference and limitations of the near-infrared retrieval algorithm, which only yields valid retrievals over near-infrared reflective surfaces [27]. Despite the advantages of MODIS data in terms of long-term observation and spatial continuity, its accuracy varies significantly, particularly in southern China [28]. Therefore, integrating or correcting MODIS data with higher-accuracy water vapor datasets is necessary to obtain high-accuracy, spatially continuous water vapor information. This has become a key research focus in the field of multi-source data fusion for PWV retrieval.

To obtain high-accuracy and spatially continuous PWV data, it is essential to systematically analyze the error characteristics of MODIS-PWV data and to develop robust fusion frameworks that integrate them with high-accuracy PWV observations. At the data processing level, various spatial interpolation methods such as Kriging interpolation and inverse distance weighting (IDW) have been widely used to reconcile the scale mismatch between discrete GNSS station observations and gridded MODIS pixels, thereby enabling unified correction modeling [29,30,31,32]. In terms of modeling strategies, both linear and nonlinear approaches have been proposed to account for the spatiotemporal heterogeneity between GNSS-PWV and MODIS-PWV, often considering seasonal [33] and regional differences [34,35]. This combined strategy of spatial registration and dynamic modeling has demonstrated robustness across various regions, including China and Australia [36]. In recent years, machine learning algorithms, with their strong nonlinear modeling and feature-learning capabilities, have provided a powerful paradigm for multi-source water vapor data fusion. Algorithms such as multiple regression [37,38], random forests [39], neural networks [40,41,42], and XGBoost [43] have been successfully applied to extract complex relationships among multi-source datasets, substantially improving retrieval accuracy. Xiong et al. [44] compared the performance of random forests, generalized neural networks, and BP neural networks in water vapor data fusion at monthly and annual scales, with results indicating that random forests achieved the highest accuracy. Wang et al. [33] integrated BP neural networks, random forests, and multiple linear regression models using MODIS NIR and TIR band information, referencing high-accuracy GNSS-PWV, to enhance water vapor retrieval accuracy under various weather conditions. Xu et al. 
[45] fused MODIS NIR band data, ERA5-PWV, and GNSS-PWV using BP neural networks, significantly improving PWV retrieval accuracy in high-latitude regions, with RMSE reductions of 81.38% when validated against radiosonde data. Numerous studies have demonstrated that constructing intelligent fusion models using GNSS-PWV as a reference, combined with machine learning algorithms, not only effectively mitigates the spatial and temporal resolution limitations of MODIS-PWV but also provides a more comprehensive reference for water vapor monitoring during extreme weather events. However, most existing studies focus on PWV fusion at monthly or annual scales, with insufficient attention given to higher temporal resolutions such as the daily scale, which is critical for meteorological monitoring and hydrological applications.

Moreover, many previous studies construct a single unified fusion model over the entire period, which improves accuracy to some extent but fails to capture the seasonal non-stationarity and residual variability of water vapor. This limitation reduces adaptability to complex wet-season conditions, when water vapor variability is high and extreme errors are more likely. Motivated by these limitations, this study aims to develop a daily-scale, season-aware PWV fusion framework that explicitly accounts for dry-wet seasonal non-stationarity. By integrating high-accuracy PWV derived from the BeiDou satellite system with MODIS-PWV using a seasonal random forest strategy, the proposed approach adjusts model parameters through seasonal partitioning. This design preserves the spatial continuity of MODIS-PWV while leveraging the accuracy and temporal stability of BDS-PWV, ultimately improving fusion accuracy, robustness, and spatial generalization ability across contrasting climatic conditions.

2. Materials and Methods

2.1. Materials

We focus on Hunan Province, China, as the study area and integrate four datasets to investigate PWV retrieval: MODIS, BDS, radiosonde data, and ERA5.

2.1.1. MODIS

MODIS is carried on board the Terra (morning overpass) and Aqua (afternoon overpass) satellites, enabling twice-daily observations of the Earth in 36 spectral channels that cover wavelengths ranging from 0.64 μm to 14.23 μm. MOD05/MYD05 is a level-2 atmospheric product released by MODIS and provided by the Level-1 and Atmosphere Archive & Distribution System Distributed Active Archive Center (LAADS DAAC). It is derived from MOD02L1B/MYD02L1B data through radiometric correction, utilizing infrared or near-infrared algorithms to analyze the transmission differences between water vapor absorption channels (channels 17, 18, and 19) and atmospheric window channels (channels 2 and 5) in the MODIS spectral channels [46]. The product shares the same spatial resolution and coverage area as the L1 data and describes the PWV over global land. We use MOD05 data acquired from the Terra satellite, with a spatial resolution of 1 km. The infrared algorithm is applicable for all-day observations but has lower accuracy, while the near-infrared algorithm is suitable only for daytime observations but provides higher retrieval accuracy. Therefore, in our study, the near-infrared MOD05 data product is used to obtain the MODIS-PWV data for Hunan Province.

2.1.2. BDS

The BDS data were obtained from the Hunan Continuous Operational Reference System (HNCORS), which covers the entire province and consists of 133 stations with an average inter-station distance of approximately 44 km. The network ensures good accessibility to BDS data and is supported by well-established ground infrastructure. In our study, zenith wet delay (ZWD) data from 109 HNCORS stations, collected by the Hunan Institute of Geomatics, were used for the period from 1 January 2020 to 5 December 2020. The spatial distribution of these stations, shown in Figure 1, demonstrates uniform coverage across the region, except for a relatively sparse distribution in eastern Zhuzhou City. This evenly distributed network provides a solid data foundation for the study. With their high accuracy, high temporal resolution, and low sensitivity to weather conditions, BDS data offer significant advantages for high-density water vapor monitoring, providing stable and temporally dense data support.

By incorporating the water vapor conversion factor Π, the ZWD is converted into PWV data, which can be expressed as:

PWV = Π \times ZWD

(1)

Π = \frac{10^{6}}{ρ R_{v} (\frac{k_{3}}{T_{m}} + k_{2}^{'})}

(2)

where ρ represents the density of liquid water, kg/m³; R_v is the specific gas constant for water vapor, with a value of 461.495 J/(kg·K); k_{2}^{'} and k₃ are atmospheric refractivity constants, valued at 16.48 K/hPa and 3.776 × 10⁵ K²/hPa, respectively; and T_m denotes the atmospheric weighted mean temperature, K. Typically, a regional T_m model is established based on the linear relationship between T_m and surface temperature (T_s) to estimate T_m in areas lacking radiosonde data. The atmospheric weighted mean temperature model in our study was constructed following the methodology outlined in [47], and T_m is calculated as:

T_{m(HN)} = 0.66 \times T_{s} + 90.33

(3)
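The conversion in Equations (1)–(3) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' processing code: the function names are hypothetical, ρ is assumed to be 1000 kg/m³, and the refractivity constants are divided by 100 to move from per-hPa to per-Pa so that Π comes out dimensionless (about 0.16 for typical surface temperatures).

```python
# Constants as quoted in the text; the unit handling below is an assumption.
RHO_W = 1000.0     # density of liquid water, kg/m^3 (assumed value)
R_V = 461.495      # specific gas constant for water vapor, J/(kg*K)
K2_PRIME = 16.48   # refractivity constant, K/hPa
K3 = 3.776e5       # refractivity constant, K^2/hPa
HPA_TO_PA = 100.0  # 1 hPa = 100 Pa


def weighted_mean_temperature(ts_kelvin: float) -> float:
    """Regional T_m model of Equation (3); T_s and T_m in kelvin."""
    return 0.66 * ts_kelvin + 90.33


def conversion_factor(tm_kelvin: float) -> float:
    """Dimensionless conversion factor of Equation (2)."""
    # Express the refractivity constants per pascal so that all units cancel.
    k2 = K2_PRIME / HPA_TO_PA
    k3 = K3 / HPA_TO_PA
    return 1.0e6 / (RHO_W * R_V * (k3 / tm_kelvin + k2))


def zwd_to_pwv(zwd_mm: float, ts_kelvin: float) -> float:
    """Equation (1): PWV has the same length unit as the input ZWD."""
    tm = weighted_mean_temperature(ts_kelvin)
    return conversion_factor(tm) * zwd_mm
```

For example, `zwd_to_pwv(200.0, 290.0)` converts a 200 mm wet delay at a surface temperature of 290 K into roughly 32 mm of PWV.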

2.1.3. Radiosonde

The radiosonde observation data used in the study are sourced from the University of Wyoming’s Atmospheric Sounding Network. The website provides upper-air meteorological sounding station data globally, which is of significant value for water vapor studies. The upper-air meteorological sounding stations release radiosonde balloons carrying radio sounding instruments twice daily (at 00:00 and 12:00 Coordinated Universal Time (UTC)) to measure basic meteorological data, including pressure, temperature, dew point temperature, and relative humidity, with a temporal resolution of 12 h. Hunan Province has three upper-air meteorological sounding stations: Changsha Station (RSCS), Huaihua Station (RSHH), and Chenzhou Station (RSCZ), with their spatial distribution shown in Figure 1. The distances between the three stations are considerable: 257.07 km between Changsha and Chenzhou, 281.18 km between Changsha and Huaihua, and 359.61 km between Huaihua and Chenzhou. Therefore, these sounding data alone cannot provide water vapor information at high temporal and spatial resolution. However, owing to their high measurement accuracy and well-established reliability, radiosonde observations are used in this study as an independent reference dataset, rather than as a strictly synchronous benchmark, to evaluate the relative retrieval accuracy of PWV derived from other data sources. RS-PWV is calculated from the radiosonde data by numerical integration, as expressed in Equation (4). Radiosonde data are obtained by layer-by-layer detection in the atmosphere at different altitudes, rather than by continuous and uninterrupted measurements, and are therefore discrete. In the actual calculation, the continuous integral is discretized to accommodate this characteristic.

PWV = \frac{\int q d p}{g} = \frac{\sum_{i}^{n} (\frac{q_{i} + q_{i + 1}}{2}) (p_{i + 1} - p_{i})}{g}

(4)

q = \frac{622 \times E}{p - 0.378 \times E}

(5)

where q represents the specific humidity of each atmospheric layer, which is an indicator of the atmospheric water vapor condition, measured in g/kg. p and E represent atmospheric pressure and water vapor pressure, respectively, both expressed in hPa. g represents the local gravitational acceleration, with a value of 980 cm/s².
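A minimal sketch of the discretized integration in Equations (4) and (5) is given below. It is illustrative only: the function names are hypothetical, and two conventions are assumptions rather than statements from the text. First, the inputs are converted to SI units (kg/kg, Pa, and g = 9.80 m/s², which equals the quoted 980 cm/s²), so the result is in kg/m², numerically equal to millimeters. Second, the levels are sorted from the surface upward and the layer thickness is taken as a positive pressure difference.

```python
G = 9.80  # gravitational acceleration, m/s^2 (980 cm/s^2 in the text)


def specific_humidity(e_hpa: float, p_hpa: float) -> float:
    """Equation (5): specific humidity in g/kg from vapor pressure and pressure."""
    return 622.0 * e_hpa / (p_hpa - 0.378 * e_hpa)


def radiosonde_pwv(p_hpa, q_gkg) -> float:
    """Trapezoidal form of Equation (4); returns PWV in mm."""
    # Sort levels from the highest pressure (surface) to the lowest (top).
    levels = sorted(zip(p_hpa, q_gkg), reverse=True)
    total = 0.0
    for (p_low, q_low), (p_up, q_up) in zip(levels, levels[1:]):
        mean_q = 0.5 * (q_low + q_up) / 1000.0  # layer-mean q, kg/kg
        dp = (p_low - p_up) * 100.0             # positive layer thickness, Pa
        total += mean_q * dp
    return total / G  # kg/m^2, numerically equal to mm of liquid water
```

As a sanity check, a column with a constant 10 g/kg between 1000 hPa and 500 hPa gives 0.01 × 50,000 / 9.80 ≈ 51 mm.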

2.1.4. ERA5

ERA5-PWV data were obtained from the fifth-generation atmospheric reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). Generated through the assimilation of multi-source observations within a numerical weather prediction framework, ERA5 provides a physically consistent and spatiotemporally continuous representation of atmospheric conditions. The ERA5-PWV data used in this study have a spatial resolution of 0.25° × 0.25° and an hourly temporal resolution, covering Hunan Province over a period consistent with the MODIS observations. It should be noted that ERA5-PWV was not involved in the construction or training of the fusion model; instead, it was used solely as an independent, spatially continuous reference to evaluate the regional-scale spatial distribution and physical consistency of the fusion PWV results.

2.2. Data Preprocessing

2.2.1. Quality Control Principles of MOD05 Data

The accuracy of MODIS water vapor retrieval varies significantly under different climatic conditions, surface characteristics, and atmospheric states. Hunan Province exhibits pronounced seasonality and complex climate variability. According to the China Meteorological Yearbook, the region demonstrates a marked seasonal pattern in cloud dynamics, with an average total cloud cover of 62%, meaning that on average approximately 62% of the sky is cloud-covered over the year. The study systematically evaluates the data quality and application performance of MODIS in retrieving water vapor content over Hunan Province. The daily satellite overpass time varies, typically falling between 02:00 and 04:00 UTC, and the region is frequently cloudy during these hours. As shown in Figure 2, the original MODIS-PWV data reveal distinct low-value regions (marked in red), which are a typical manifestation of the systematic underestimation caused by cloud cover. The underestimation of water vapor is primarily caused by the absorption and scattering of solar radiation in the near-infrared spectrum by clouds, which attenuates the surface-reflected signal and degrades retrieval accuracy.

Although cloud masking can remove part of the contaminated MODIS observations, it substantially reduces the amount of valid data and often results in large areas with no available measurements under cloudy conditions, severely degrading spatial continuity. In addition to cloud contamination, MODIS observations, which represent instantaneous measurements, exhibit pronounced spatial heterogeneity. In typical cases over Hunan, the spatial variation in PWV can exceed 40 mm, resulting in substantial discrepancies from the actual water vapor distribution. This indicates that the uncertainties in MODIS-PWV stem not only from cloud contamination but also from the inherently instantaneous nature of the observations.

In light of these characteristics, the fusion framework developed in this study treats MODIS-PWV as a spatially continuous prior water-vapor field to compensate for the limited spatial coverage of BDS-PWV, rather than relying on its absolute accuracy. Therefore, strict cloud masking was not applied during preprocessing; instead, only outliers were removed to retain the spatial structure of the MODIS observations and enable effective integration with other datasets. The strategy relies on the machine-learning model to learn and correct the systematic biases introduced by cloud contamination and the instantaneous nature of MODIS observations during the fusion phase, thereby enhancing model stability and applicability under cloudy conditions without compromising spatial continuity.

2.2.2. Sample Matching Principles

MODIS provides instantaneous PWV measurements at the time of satellite overpass, which is insufficient to characterize the daily background state of atmospheric water vapor. To investigate the spatiotemporal distribution of PWV at the daily scale, it is therefore necessary to construct daily-scale water vapor fields. Because BDS-PWV has a temporal resolution of 5 min, it effectively captures diurnal variations in water vapor and thus serves as a reliable source for representing daily PWV conditions. Under rain-free conditions, intraday PWV variability is modest. Although rainfall increases short-term fluctuations, the daily mean remains the most robust and representative indicator of the day-scale water vapor state and avoids the influence of transient extremes on the fusion model. In the fusion framework, MODIS-PWV primarily provides high-resolution spatial information and regional gradients, whereas BDS-PWV contributes high-accuracy, temporally stable constraints. The training samples were therefore matched as follows. For temporal matching, the BDS-PWV daily mean is used to represent the day’s atmospheric water vapor content and is paired with the corresponding MODIS-PWV observation to construct a stable and reliable daily PWV field. For spatial matching, each CORS station is matched with the mean of a 3 × 3 window centered on the corresponding MODIS pixel.
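The two matching rules can be illustrated with a short sketch. The helper names are hypothetical, and the handling of missing values (skipping NaN samples and clipping the window at the grid edge) is an assumption, since the text does not describe how gaps are treated.

```python
import math


def daily_mean(samples):
    """Mean of one day's 5-min BDS-PWV samples, skipping missing values."""
    valid = [v for v in samples if v is not None and not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


def window_mean(grid, row, col, half=1):
    """Mean of the (2*half+1) x (2*half+1) MODIS window centered on (row, col).

    Pixels outside the grid and NaN pixels are ignored (an assumption).
    """
    values = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                v = grid[r][c]
                if not math.isnan(v):
                    values.append(v)
    return sum(values) / len(values) if values else math.nan
```

One training sample then pairs `window_mean(modis_grid, row, col)` for the pixel containing a station with `daily_mean(bds_samples)` for the same station and day, where `modis_grid` and `bds_samples` are placeholder inputs.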

2.2.3. Description of Dry and Wet Season

In climatological studies, dry and wet seasons are commonly defined based on the seasonal variability of regional precipitation, in conjunction with key atmospheric factors such as precipitable water vapor and large-scale moisture transport [48]. The wet season is typically characterized by enhanced monsoonal moisture transport, frequent precipitation, and a higher atmospheric water-holding capacity, resulting in generally elevated PWV levels, whereas the dry season corresponds to weakened moisture transport, reduced precipitation, and relatively low PWV values. Based on the long-term precipitation statistics and the seasonal characteristics of PWV over Hunan Province, this study defines May to October as the wet season, corresponding to day of year (DOY) 122–305, and November to the following April as the dry season, corresponding to DOY 1–121 and 306–366. Owing to limitations in data availability, the dry-season samples used in this study are restricted to DOY 1–121 and 306–340 in 2020. This classification is intended to capture the pronounced seasonal contrasts in regional atmospheric water vapor conditions.
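The seasonal partition reduces to a simple DOY threshold test, sketched below with hypothetical function names. The bounds 122 and 305 correspond to 1 May and 31 October in a leap year such as 2020; in a non-leap year the same calendar dates fall one day earlier.

```python
from datetime import date

WET_START, WET_END = 122, 305  # DOY bounds of the wet season given in the text


def day_of_year(d: date) -> int:
    """Day of year, with 1 January as DOY 1."""
    return d.timetuple().tm_yday


def season(doy: int) -> str:
    """Return 'wet' for DOY 122-305 and 'dry' otherwise."""
    return "wet" if WET_START <= doy <= WET_END else "dry"
```

For instance, `season(day_of_year(date(2020, 12, 5)))` classifies the last day of the 2020 record (DOY 340) as dry.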

2.3. Statistical Metrics

To evaluate the accuracy of PWV retrieval, the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Pearson Correlation Coefficient (R) are selected as evaluation metrics. Their definitions are listed below:

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}

(6)

MAE = \frac{1}{n} \sum_{i = 1}^{n} |y_{i} - {\hat{y}}_{i}|

(7)

R = \frac{\sum_{i = 1}^{n} (x_{i} - \bar{X}) (y_{i} - \bar{Y})}{\sqrt{\sum_{i = 1}^{n} {(x_{i} - \bar{X})}^{2} \sum_{i = 1}^{n} {(y_{i} - \bar{Y})}^{2}}}

(8)

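where y_i and {\hat{y}}_{i} in Equations (6) and (7) represent the predicted and observed values, respectively; x_i and y_i in Equation (8) are the paired values of the two datasets being compared; n denotes the number of data points; and \bar{X} and \bar{Y} represent the mean values of the two datasets, respectively.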
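For reference, the three metrics can be computed as in the following sketch, which uses the standard definitions of RMSE, MAE, and the Pearson correlation coefficient. The function names are illustrative and the code is not taken from the study.

```python
import math


def rmse(a, b):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


def mae(a, b):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    var_x = sum((u - mean_x) ** 2 for u in x)
    var_y = sum((v - mean_y) ** 2 for v in y)
    return cov / math.sqrt(var_x * var_y)
```

RMSE and MAE are symmetric in their two arguments, so the order in which the reference and estimated series are passed does not affect the result.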

3. Results

3.1. Performance Evaluation of Multi-Source PWV Products

3.1.1. Evaluation of MODIS-PWV

Based on RS-PWV data, our study quantitatively evaluates the accuracy characteristics of MODIS-PWV over Hunan Province. A spatiotemporal matching strategy was employed for data alignment. In the spatial dimension, a 3 × 3 pixel window centered on the MODIS pixel corresponding to the radiosonde station was extracted, and its average value was matched with RS-PWV. In the temporal dimension, the satellite overpass time does not coincide with the radiosonde observation time, so the instantaneous MODIS-PWV values at the satellite overpass were associated with the RS-PWV observations at 00:00 UTC. Although this introduces a potential temporal offset of several hours, its impact is considered limited, as PWV generally varies smoothly over short time intervals outside rapidly evolving convective conditions, making this matching strategy reasonable and widely adopted in satellite PWV validation studies.

Figure 3 illustrates the correlation and residual distribution between the two datasets at the RSCS, RSHH, and RSCZ stations. Correlation analysis reveals weak relationships between MODIS-PWV and RS-PWV at the three radiosonde stations, with correlation coefficients of 0.4012, 0.3839, and 0.4223, respectively. These values point to substantial discrepancies between MODIS-PWV and RS-PWV, with MODIS-PWV generally showing a systematic underestimation. Further analysis of the residual time series reveals pronounced seasonal variation in the differences between MODIS-PWV and RS-PWV, with large fluctuations in the residuals throughout the year. Notably, the largest residuals occur during the wet season, even exceeding 60 mm, indicating that the retrieval accuracy of MODIS-PWV declines markedly under high-humidity conditions, with underestimation becoming more severe as water vapor increases.

The seasonal difference in residuals is primarily attributed to the contrasting atmospheric conditions between dry and wet seasons. During the dry season, water vapor content is relatively low and cloud interference is minimal, leading to smaller retrieval errors. In contrast, the wet season is characterized by high humidity, extensive cloud cover, and varying surface reflectance, which can interfere with remote sensing signals through cloud obscuration and atmospheric scattering. As a result, the stability and accuracy of MODIS-PWV decrease significantly, reflecting its high sensitivity and instability during the wet season. Further accuracy assessment results summarized in Table 1 show that the RMSE ranges from 21.00 mm to 26.20 mm and the MAE ranges from 15.85 mm to 19.59 mm, with average RMSE and MAE values of 23.80 mm and 18.04 mm, respectively. All of these values exceed 15 mm, which further confirms the limited accuracy of MODIS-PWV data over Hunan Province.

Overall, these findings indicate that the applicability of MODIS-PWV in this region is subject to notable limitations, particularly during the wet season. In practical applications, it is essential to account for such seasonal biases and consider the integration of high-accuracy datasets for correction and fusion to improve the overall accuracy of water vapor retrievals.

3.1.2. Evaluation of BDS-PWV

BDS-PWV and RS-PWV are both point-based datasets; however, they differ in temporal resolution and spatial location. Therefore, a comparative analysis must be conducted under a unified spatiotemporal framework. To address temporal discrepancies, a time-matching strategy is implemented. Specifically, for each CORS station, the average BDS-PWV values within one hour following the radiosonde launch times are computed and compared with the corresponding RS-PWV values. This approach ensures a reasonable and consistent data alignment for validation. For spatial matching, we select CORS stations located within a 40 km radius of each radiosonde site for BDS-PWV accuracy validation. The corresponding station information is provided in Table 2.
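The matching described above can be sketched as follows. The text does not state how station distances were computed, so the great-circle (haversine) formula with a mean Earth radius of 6371 km is an assumption, as are the function names and the choice to treat the one-hour window as closed at the launch time and open at its end.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_after_launch(series, launch, window=timedelta(hours=1)):
    """Mean of (time, PWV) samples with launch <= time < launch + window."""
    values = [v for t, v in series if launch <= t < launch + window]
    return sum(values) / len(values) if values else None
```

A CORS station qualifies for validation when `haversine_km(...)` to the radiosonde site is at most 40 km, and the value compared with RS-PWV is `mean_after_launch(series, launch)` at the 00:00 or 12:00 UTC launch time.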

Figure 4 presents the PWV time series for the RSCS-CSKC, RSHH-HHZF, and RSCZ-CZYX station pairs. The results indicate that the BDS-PWV retrievals exhibit a high degree of consistency with the corresponding RS-PWV, demonstrating similar temporal variation trends. This strong correlation suggests that BDS-PWV effectively captures actual PWV variations, confirming the reliability of BDS-PWV retrievals. Based on the characteristics of the PWV time series, it is observed that during the wet season, the atmospheric water vapor content is relatively high, exhibiting more intense fluctuations and strong temporal variability. In contrast, during the dry season, the water vapor content is comparatively low, with smaller fluctuations.

To quantitatively evaluate the accuracy of BDS-PWV, Table 2 provides the RMSE, MAE, and R for each validation station. The results show that the correlation coefficient (R) between BDS-PWV and RS-PWV ranges from 0.9754 to 0.9854, indicating a strong temporal consistency. The RMSE values for all stations range from 2.95 mm to 3.81 mm, while MAE values fall within 2.32 mm to 3.03 mm. The average RMSE and MAE are 3.47 mm and 2.71 mm, respectively. These results suggest that the deviation between BDS-PWV and RS-PWV is minimal, confirming the reliability of BDS-PWV for high-accuracy atmospheric water vapor retrieval.

3.2. Seasonal Dependence of MODIS-BDS PWV Relationship and Residual Characteristics

Due to significant differences in water vapor variation across months and seasons, constructing a full-season fusion model yields limited improvement for certain periods. Therefore, the daily mean BDS-PWV data are used as the reference to assess the correlation and residual characteristics of MODIS-PWV data. The results indicate that the correlation between BDS-PWV and MODIS-PWV, as well as the residual characteristics, are similar across stations, with R generally ranging from 0.40 to 0.60. This indicates only a weak-to-moderate positive correlation between the two datasets.

For clarity and conciseness, three stations (CDAX, YYSQ, and WCLZ) were randomly selected to demonstrate the correlation and residual distribution, as shown in Figure 5. The results reveal substantial discrepancies between MODIS-PWV and BDS-PWV. During the dry season, the residuals generally range within ±20 mm, indicating relatively stable retrievals and smaller errors. However, in the wet season, MODIS-PWV exhibits significant underestimation, with a wider residual distribution between −60 mm and 20 mm. The results reveal clear systematic differences in water vapor retrieval accuracy between the dry and wet seasons, indicating that a unified full-period model cannot effectively capture their distinct characteristics, particularly limiting performance in the wet season. To address this, a fusion strategy based on the division of dry and wet seasons is proposed. This approach aligns with the physical mechanisms of water vapor transport and mitigates wet season underestimation bias, enabling adaptive optimization of the nonstationary features across seasons.

3.3. Fusion Model Construction

The above analysis indicates that BDS and MODIS PWV data exhibit strong nonstationary and nonlinear characteristics between dry and wet seasons. Accordingly, a random forest-based fusion strategy incorporating seasonal division is proposed. Compared with traditional linear models, the random forest algorithm can flexibly capture complex nonlinear relationships through the integration of multiple decision trees and adaptively learn seasonal variations via its splitting mechanism. Unlike neural networks, RF requires less data and simpler parameter tuning, while its feature importance analysis reveals the differential contributions of influencing factors to PWV fusion, enhancing model interpretability.

Considering the influence of season, elevation, and geographic location on the spatiotemporal distribution of PWV, we leveraged the random forest algorithm’s capability to handle diverse input features. Five parameters are selected as input features: MODIS-PWV, DOY, longitude, latitude, and elevation, while BDS-PWV serves as the reference target used to train the model. The inclusion of DOY allows capturing systematic temporal variation in PWV; longitude and latitude reflect regional water vapor gradients; elevation accounts for topographic effects on atmospheric water vapor distribution. Together, these variables enable the model to learn robust spatiotemporal variations in PWV beyond what MODIS alone can provide.
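
To make the input layout concrete, the sketch below assembles the five features and the BDS-PWV target into plain Python lists. The `Sample` class, the field names, and the example values are illustrative assumptions; only the choice of the five features and the target follows the text.

```python
from dataclasses import dataclass

# Feature order used when building the design matrix
FEATURE_NAMES = ("modis_pwv", "doy", "longitude", "latitude", "elevation")

@dataclass(frozen=True)
class Sample:
    modis_pwv: float  # MODIS-PWV at the station pixel, mm
    doy: int          # day of year
    longitude: float  # degrees east
    latitude: float   # degrees north
    elevation: float  # station elevation, m
    bds_pwv: float    # daily mean BDS-PWV, mm (training target)

def to_xy(samples):
    """Split samples into a feature matrix X and a target vector y."""
    X = [[getattr(s, name) for name in FEATURE_NAMES] for s in samples]
    y = [s.bds_pwv for s in samples]
    return X, y

# Made-up sample for demonstration only
samples = [Sample(18.2, 45, 112.9, 28.2, 60.0, 24.5)]
X, y = to_xy(samples)
print(X, y)
```

Lists in this shape can be passed directly to most regression libraries that accept array-like inputs.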

A station-based hold-out validation strategy was adopted in this study. By setting a fixed random seed, the 109 HNCORS stations were randomly divided into training and independent test subsets at a ratio of 8:2, ensuring the reproducibility of the experiments. The fusion model was constructed using all samples from 87 training stations and subsequently evaluated at 22 validation stations that were completely excluded from the training process. This strategy effectively avoids spatial information leakage among stations and provides a more realistic assessment of the model’s spatial generalization ability and robustness, thereby demonstrating the reliability and regional transferability of the fusion model in areas not covered by the training data. In addition, a random search strategy was employed for hyperparameter optimization of the random forest model, in which key parameters including the number of trees, maximum tree depth, minimum samples per leaf, and maximum features were systematically explored and the optimal combination was selected based on cross-validation performance. The results indicate that the maximum features and minimum samples per leaf remained at their default values, while the number of trees and maximum tree depth were the primary tuned parameters, with optimal settings (number of trees, maximum depth) of (300, 25) for the dry season and (200, 25) for the wet season.
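
The station-level 8:2 split can be sketched in a few lines of standard-library Python. The station identifiers and the seed value 42 are placeholders (the seed used in the study is not reported here); the point of the sketch is that whole stations, not individual samples, are assigned to one subset.

```python
import random

def split_stations(station_ids, train_fraction=0.8, seed=42):
    """Randomly assign whole stations to a training or a test subset."""
    ids = sorted(station_ids)  # sort so the result does not depend on input order
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Placeholder identifiers standing in for the 109 HNCORS stations
stations = [f"ST{i:03d}" for i in range(109)]
train_ids, test_ids = split_stations(stations)
print(len(train_ids), len(test_ids))  # 87 22

# Samples are then routed by station membership, for example:
# train = [s for s in samples if s.station_id in set(train_ids)]
```

Because every sample from a test station is withheld, the evaluation measures performance at locations the model has never seen, which is the property the text relies on to avoid spatial leakage.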

The model was trained separately for dry and wet seasons, allowing dynamic adaptation to seasonal residual differences and achieving adaptive weight distribution across seasons. Compared with the traditional full-period static fusion model, this strategy effectively reduces the impact of extreme residuals while maintaining high correlation, enhances seasonal adaptability, improves the accuracy and stability of PWV fusion, and further strengthens the model’s spatial generalization capability across the study region. The model construction framework is illustrated in Figure 6.

3.4. Performance Evaluation of the Fusion Model

3.4.1. Validation of Fusion Model Performance Based on Independent Stations

Following the proposed fusion model construction process and data preparation principles, a random forest model was trained. The dataset was randomly divided into training and testing subsets based on the number of stations, with a total of 2975 test samples for the dry season and 3439 samples for the wet season.

To further quantitatively assess the performance of the fusion model under different seasonal conditions and to examine its capability to accommodate seasonal non-stationarity, Figure 7 presents the prediction results of the fusion model for both the dry and wet seasons. The scatter distributions (Figure 7a,c) indicate that, in both seasons, the fusion PWV exhibits a markedly stronger linear consistency with BDS-PWV than the original MODIS-PWV, characterized by substantially reduced dispersion and significantly enhanced correlation. During the dry season, the fusion model achieves an R² of 0.9617, with RMSE and MAE values of 1.67 mm and 1.21 mm, respectively, corresponding to improvements of 90.48%, 87.73%, and 88.68% relative to MODIS-PWV. In the wet season, similarly high performance is maintained, with an R² of 0.9587, RMSE of 2.36 mm, and MAE of 1.74 mm, representing improvements of 99.29%, 91.76%, and 92.30%, respectively. The consistently high R² values indicate stable predictive performance in both seasons, with slightly better accuracy during the dry season.

As shown in Figure 7b,d, residual analysis further demonstrates the stability of the fusion results, as residuals in both seasons are concentrated near zero with relatively narrow distributions. In the dry season, 96.91% of residuals fall within ±4 mm, with only a small number exceeding ±8 mm, indicating strong reliability under relatively stable water vapor conditions, although larger fluctuations occur when PWV ranges from 10 to 30 mm. In contrast, during the wet season, 91.36% of the residuals remain within ±4 mm, while the number of extreme residuals exceeding ±8 mm increases noticeably and the distribution becomes more dispersed, particularly for PWV values between 35 and 65 mm.
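
The residual shares quoted above (within ±4 mm, beyond ±8 mm) are simple threshold counts. A minimal sketch, using made-up residuals rather than the study data and an illustrative helper name `fraction_within`:

```python
def fraction_within(residuals, threshold_mm):
    """Share of residuals whose absolute value is at most threshold_mm."""
    if not residuals:
        raise ValueError("residuals must not be empty")
    hits = sum(1 for r in residuals if abs(r) <= threshold_mm)
    return hits / len(residuals)

# Made-up residuals (fusion PWV minus BDS-PWV, mm) for demonstration only
residuals = [0.5, -1.2, 3.9, -4.0, 6.3, -8.5, 1.1, 2.2, -0.3, 9.4]
within_4 = fraction_within(residuals, 4.0)
beyond_8 = 1.0 - fraction_within(residuals, 8.0)
print(f"within ±4 mm: {within_4:.0%}, beyond ±8 mm: {beyond_8:.0%}")
```

Applying the function separately to dry-season and wet-season residuals yields the kind of per-season percentages reported in the text.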

3.4.2. Regional PWV Estimation and Accuracy Assessment in Hunan Province

The primary objective of constructing the daily-scale water vapor data fusion model is to generate a dataset that balances high accuracy and spatial continuity. Figure 8 illustrates the spatial distribution of water vapor before and after fusion. Figure 8a,e,i,m depict the discrete BDS-PWV observations together with the spatially continuous PWV fields derived from the inverse distance weighting (IDW) interpolation method. MODIS-PWV (Figure 8b,f,j,n) significantly underestimates PWV and is further affected by noisy pixels and spatial inconsistencies. Due to its relatively coarse spatial resolution, ERA5-PWV (Figure 8d,h,l,p) is unable to resolve fine-scale spatial variability of PWV and is therefore used only as a reference for the overall regional distribution pattern.
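
For readers unfamiliar with IDW, the sketch below interpolates station PWV to an arbitrary point. It is a simplified illustration and not the configuration used in the study: the distance is a planar distance in degrees, and the power parameter of 2 is an assumed common default.

```python
import math

def idw(stations, lon, lat, power=2.0):
    """Inverse distance weighted PWV at (lon, lat).

    stations: iterable of (lon, lat, pwv_mm) tuples.
    """
    num = 0.0
    den = 0.0
    for s_lon, s_lat, pwv in stations:
        d = math.hypot(lon - s_lon, lat - s_lat)  # planar distance, degrees
        if d == 0.0:
            return pwv  # the query point coincides with a station
        w = 1.0 / d ** power
        num += w * pwv
        den += w
    return num / den

# Two made-up stations for demonstration only
stations = [(112.0, 28.0, 30.0), (113.0, 28.0, 40.0)]
print(idw(stations, 112.5, 28.0))   # midpoint, equal weights
print(idw(stations, 112.25, 28.0))  # closer to the first station
```

Because the weights depend only on distance, IDW cannot introduce spatial structure between stations, which is one reason the station-only fields in Figure 8 appear smoother than the fused product.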

In contrast, the fusion PWV (Figure 8c,g,k,o) shows significant improvements in both numerical accuracy and spatial distribution, effectively capturing the primary spatial gradients of atmospheric water vapor over Hunan Province, particularly those associated with topographic variability. The fusion PWV demonstrates stronger large-scale spatial consistency with ERA5-PWV while revealing substantially richer spatial details. At the same time, the fusion effectively mitigates the systematic underestimation inherent in MODIS-PWV, enabling the fusion product to retain the spatial continuity of MODIS while achieving numerical consistency closer to that of BDS-PWV.

To further evaluate the accuracy of the BDS/MODIS PWV data, RS-PWV was used for validation at three radiosonde stations in Hunan Province: RSCS, RSHH, and RSCZ, as shown in Figure 9. RS-PWV values were derived from the average of two soundings at each station. The fusion PWV closely matches RS-PWV in both temporal trends and magnitude. As summarized in Table 3, the correlation coefficient between the BDS/MODIS PWV and RS-PWV exceeds 0.95 at all stations. The RMSE ranges from 4.63 to 4.82 mm, and the MAE ranges from 3.74 to 3.89 mm, with mean values of 4.71 mm and 3.81 mm, respectively. Compared with the original MODIS-PWV data, which had an average RMSE of 23.80 mm and MAE of 18.04 mm, the BDS/MODIS PWV data reduce these errors by 80.21% and 78.88%, respectively, demonstrating strong error mitigation. Despite environmental differences among stations, the inter-station spreads in RMSE and MAE do not exceed 0.20 mm and 0.15 mm, respectively, confirming the model’s strong generalization and spatial robustness. Furthermore, error reduction rates exceeding 75% at all stations further validate the feasibility and effectiveness of the random forest-based daily-scale fusion model.
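
The reduction percentages follow directly from the values in Tables 1 and 3 as (before − after) / before × 100. The short check below reproduces them; the helper name `reduction_pct` is illustrative, and the mean values are used as rounded in the text.

```python
def reduction_pct(before, after):
    """Relative error reduction in percent."""
    return (before - after) / before * 100.0

# (RMSE, MAE) in mm: MODIS-PWV from Table 1, fusion PWV from Table 3
modis = {"RSCS": (26.20, 19.59), "RSHH": (24.20, 18.68), "RSCZ": (21.00, 15.85)}
fusion = {"RSCS": (4.82, 3.89), "RSHH": (4.63, 3.74), "RSCZ": (4.69, 3.81)}

per_station = {
    name: (
        round(reduction_pct(modis[name][0], fusion[name][0]), 2),
        round(reduction_pct(modis[name][1], fusion[name][1]), 2),
    )
    for name in modis
}
print(per_station)

# Mean values as reported (rounded) in the text
print(round(reduction_pct(23.80, 4.71), 2))  # 80.21
print(round(reduction_pct(18.04, 3.81), 2))  # 78.88
```

The per-station results match the RMSE and MAE reduction columns of Table 3, and the smallest value (75.96% for MAE at RSCZ) is consistent with the statement that all reductions exceed 75%.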

3.5. Spatiotemporal Distribution Characteristics of the Fusion PWV

The spatial distribution of PWV in Hunan Province exhibits a pattern of increasing from west to east and from north to south, as shown in Figure 10, which aligns with the region’s climatic characteristics. A long-term analysis of the relationship between topography and PWV revealed a strong negative correlation, with a correlation coefficient of −0.8498. In mountainous and high-altitude areas, such as western Hunan, southern Hunan, and southeastern Hunan, PWV values remain relatively low due to the obstruction of moisture transport by mountain ranges. In contrast, northeastern Hunan, central Hunan, and south-central Hunan, which are characterized by lower elevation and gentler slopes, exhibit a more uniform distribution of water vapor, with relatively higher PWV values. Apart from topographic influences, water systems also significantly affect the spatial distribution of atmospheric water vapor, and serve as critical moisture sources. In Hunan Province, the water systems are primarily concentrated in the northeast, with the Xiang, Zi, Yuan, and Li Rivers traversing the entire province. Elevated water vapor levels are commonly observed in areas adjacent to the water systems, especially in topographically complex mountainous regions.

When PWV is aggregated to the monthly scale, seasonal variability becomes more clearly discernible between the dry and wet seasons. During the dry season, atmospheric water vapor remains relatively low, with peak PWV values generally ranging from 20 to 30 mm and minimum values between 10 and 20 mm. In contrast, the wet season is characterized by a pronounced increase in PWV, with peak values reaching approximately 40–60 mm and minimum values between 30 and 50 mm, highlighting the substantial seasonal contrast in regional water vapor conditions.

The temporal evolution of PWV follows a distinct seasonal cycle:

January-April: PWV shows a gradual upward trend;
May: A rapid increase in PWV is observed;
June-August: PWV reaches and maintains its peak levels, with increased regional spatial differences;
September-October: A gradual decline in PWV occurs;
November-December: PWV continues to decrease, remaining at its lowest levels.
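
The monthly aggregation underlying this seasonal cycle can be sketched as a simple grouping of daily values by calendar month. The function name `monthly_mean` and the daily values are illustrative assumptions; the year 2020 follows the period shown in Figure 10.

```python
from collections import defaultdict
from datetime import date, timedelta

def monthly_mean(daily_pwv, year=2020):
    """Average daily PWV (mm) per calendar month.

    daily_pwv maps day of year (1-based) to PWV in mm.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for doy, pwv in daily_pwv.items():
        month = (date(year, 1, 1) + timedelta(days=doy - 1)).month
        sums[month] += pwv
        counts[month] += 1
    return {m: sums[m] / counts[m] for m in sorted(sums)}

# Made-up daily values for demonstration only
daily = {1: 12.0, 31: 14.0, 32: 16.0, 183: 55.0, 366: 11.0}
print(monthly_mean(daily))
```

Months without any valid daily value are simply absent from the result, so gaps caused by missing data remain visible rather than being filled silently.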

The seasonal variations in PWV are strongly correlated with regional climatic conditions and monsoon circulation patterns. During summer, high temperatures and humidity, combined with the dominant warm and moist southeasterly monsoon, lead to an increase in water vapor content. Conversely, in winter, lower temperatures, reduced evaporation, and the prevalence of dry and cold northwesterly winds result in significantly lower PWV levels. These findings highlight the critical influence of both topography and climate on the spatial and temporal variability of atmospheric water vapor, emphasizing the necessity of incorporating such factors into high-resolution PWV modeling and prediction.

4. Discussion

Seasonal comparison indicates that the proposed fusion model effectively captures the non-stationary characteristics of atmospheric water vapor under dry and wet seasons while maintaining stable performance across varying moisture backgrounds. The fusion PWV exhibits high consistency with RS-PWV throughout the year, demonstrating that the model successfully mitigates the systematic underestimation inherent in MODIS-PWV and aligns the results with high-accuracy ground-based observations. Importantly, even under wet-season conditions characterized by high humidity, enhanced cloud coverage, and pronounced spatial heterogeneity, the fusion results exhibit only minor accuracy degradation and well-controlled extreme residuals. This confirms that explicitly accounting for seasonal variability is critical for improving model robustness under complex wet-season conditions, an aspect that is often insufficiently addressed in unified fusion models.

The improved performance can be attributed to the synergistic integration of multi-source observations while explicitly incorporating temporal and geographic controls on water vapor variability. MODIS-PWV provides high-resolution spatial information that enables the model to learn regional-scale spatial gradients, while BDS-PWV delivers accurate and temporally stable measurements that constrain numerical estimates, ensuring spatial detail is preserved without drift, particularly under high-PWV conditions. By combining these complementary characteristics within a season-aware random forest framework, the model effectively suppresses error propagation across contrasting weather conditions. Multi-station validation further confirms strong spatial generalization, indicating that the proposed approach is transferable beyond dense observation networks and suitable for application in observation-sparse regions.

Nonetheless, limitations remain. The model relies on high-accuracy BDS-PWV observations, which may limit performance in areas with sparse station coverage. Moreover, the current framework focuses on daily-scale fusion and does not fully capture sub-daily variability, which is critical for rapidly evolving atmospheric phenomena. These limitations delineate the present applicability boundary of the method rather than undermining its effectiveness. Future efforts could explore multi-sensor integration, including geostationary satellite data and high-resolution numerical weather prediction outputs, to improve temporal fidelity and extend applicability to other regions.

5. Conclusions

The high-accuracy monitoring of precipitable water vapor in both time and space is critically important for applications in meteorology and hydrology. MODIS and GNSS represent two key technologies for obtaining global PWV data; however, they exhibit significant differences in spatial and temporal resolution as well as retrieval accuracy, highlighting their strong complementarity.

We developed a retrieval model that fuses PWV data from MODIS and BDS, incorporating DOY and geographic parameters to capture spatiotemporal variability. The proposed model was evaluated over Hunan Province using radiosonde observations and ERA5-PWV as independent references. Validation results demonstrate that the fusion product effectively combines the high spatial resolution of MODIS with the high accuracy and temporal stability of BDS-PWV, yielding a PWV dataset with both high numerical accuracy and spatial continuity. Compared with the original MODIS-PWV, the BDS/MODIS PWV shows a substantial accuracy improvement, with correlation coefficients exceeding 0.95 relative to RS-PWV and average RMSE and MAE reduced to 4.71 mm and 3.81 mm, corresponding to reductions of 80.21% and 78.88%, respectively. The independent comparison with ERA5-PWV further confirms the spatial consistency of the fusion dataset, indicating that the proposed model effectively mitigates the systematic underestimation and weather-dependent limitations of MODIS-PWV, particularly during the wet season, while preserving fine-scale spatial detail constrained by BDS-PWV.

The fusion dataset successfully captures regional-scale water vapor gradients associated with topography and hydrometeorological patterns and exhibits pronounced seasonal variability driven by monsoonal circulation. When aggregated to the monthly scale, the fusion results clearly reflect the contrast between dry and wet seasons, with PWV generally ranging from 10 to 30 mm and from 30 to 65 mm, respectively, demonstrating the model’s capability to adapt to non-stationary seasonal water vapor variations.

In summary, this study advances methodologies for retrieving atmospheric water vapor by integrating BDS and remote sensing observations, promotes the application of the BeiDou system in PWV monitoring, and provides data support for disaster prevention and mitigation applications in the meteorological and hydrological fields. The proposed framework provides a scalable foundation for future investigations of atmospheric water vapor at finer temporal resolutions and in higher-dimensional representations, thereby contributing to improved weather forecasting and climate-related studies.

Author Contributions

Conceptualization, M.S., Z.P. and J.L.; methodology, M.S., Z.P., J.L., W.J. and X.Q.; software, Z.P., J.L. and M.S.; validation, Z.P., J.L. and M.S.; formal analysis, M.S.; investigation, Z.P. and M.S.; resources, Z.P. and M.S.; data curation, Z.P., W.J., X.Q. and M.S.; writing—original draft preparation, M.S.; writing—review and editing, Z.P., J.L., Z.Z. and M.S.; visualization, M.S.; supervision, Z.P.; project administration, Z.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the Open Bidding and Leading Project Foundation of IWHR, grant number JZ110145B0082025; in part by the Beijing Natural Science Foundation-Fengtai Innovation Joint Fund Project, grant number L241046; and in part by the National Natural Science Foundation of China, grant number 42301450.

Data Availability Statement

The dataset analyzed in this study is managed by the China Institute of Water Resources and Hydropower Research. The data can be made available by the corresponding author upon request.

Acknowledgments

The authors sincerely thank the team of Dai W. and Chen B. from Central South University for providing data support and assistance during the experimental period.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Spatial distribution of radiosonde stations and HNCORS stations in Hunan Province, China.

Figure 2. Comparison of original and cloud-screened MODIS-PWV over Hunan Province.

Figure 3. Correlation analysis and residual distribution between RS-PWV and MODIS-PWV at radiosonde stations (RSCS, RSHH, and RSCZ). (a–c) Linear regression analysis between RS-PWV and MODIS-PWV, where the blue dots represent paired RS-PWV and MODIS-PWV observations and the red dashed line represents the 1:1 reference line, along which MODIS-PWV and RS-PWV would agree exactly. (d–f) Residual distribution of MODIS-PWV relative to RS-PWV as a function of DOY, where the blue dots represent the residuals (MODIS-PWV minus RS-PWV), and the red dashed lines indicate the transition between dry and wet seasons.

Figure 4. PWV time series of radiosonde and CORS stations: (a) RSCS and CSKC, (b) RSHH and HHZF, and (c) RSCZ and CZYX.

Figure 5. Correlation analysis and residual distribution between BDS-PWV and MODIS-PWV at three selected stations (CDAX, YYSQ, and WCLZ). (a–c) Linear regression analysis between BDS-PWV and MODIS-PWV, where the blue dots represent paired BDS-PWV and MODIS-PWV observations and the red dashed line represents the 1:1 reference line, along which MODIS-PWV and BDS-PWV would agree exactly. (d–f) Residual distribution of MODIS-PWV relative to BDS-PWV as a function of DOY, where the blue dots represent the residuals (MODIS-PWV minus BDS-PWV), and the red dashed lines indicate the transition between dry and wet seasons.

Figure 6. Workflow of the daily scale water vapor data fusion model based on the Random Forest algorithm.

Figure 7. Performance evaluation of the daily-scale random forest water vapor data fusion model under dry and wet season conditions. (a,c) Correlation analysis between fusion PWV (BDS/MODIS PWV) and BDS-PWV as well as MODIS-PWV, where the blue dashed line represents the linear regression between BDS/MODIS PWV and BDS-PWV, and the red dashed line represents the linear regression between BDS/MODIS PWV and MODIS-PWV; (b,d) Residual characteristics between BDS/MODIS PWV and BDS-PWV as well as MODIS-PWV, where the red dashed line indicates the zero-residual reference line.

Figure 8. Comparison of daily-scale PWV spatial distribution across four seasons in 2020.

Figure 9. Time series of RS-PWV and fusion PWV (BDS/MODIS PWV) data at radiosonde stations: (a) RSCS, (b) RSHH, and (c) RSCZ.

Figure 10. Monthly spatial distribution of BDS/MODIS PWV from January to December in 2020.

Table 1. Statistical evaluation of MODIS-PWV accuracy compared to RS-PWV at radiosonde stations, including R, RMSE, and MAE.

Radiosonde Station	R	RMSE (mm)	MAE (mm)
RSCS	0.4012	26.20	19.59
RSHH	0.3839	24.20	18.68
RSCZ	0.4223	21.00	15.85

Table 2. Statistical evaluation of BDS-PWV accuracy compared to RS-PWV at validation stations, including R, RMSE, and MAE.

Radiosonde Station	HNCORS Station	Distance to Radiosonde Station (km)	R	RMSE (mm)	MAE (mm)
RSCS	CSKC	23.09	0.9854	3.17	2.46
	SSZF	33.01	0.9837	3.31	2.87
	WCLZ	38.23	0.9813	3.68	2.80
	XTYL	38.62	0.9796	3.81	2.92
RSHH	HHSQ	4.70	0.9784	3.36	2.59
	HHZF	28.84	0.9790	3.36	2.53
	HHZJ	32.87	0.9779	3.47	2.59
RSCZ	CZZX	27.95	0.9754	3.81	3.03
	CZGY	29.89	0.9824	2.95	2.32
	CZYX	36.96	0.9797	3.81	2.96

Table 3. Statistical evaluation of BDS/MODIS PWV accuracy compared to RS-PWV at radiosonde stations, including R, RMSE, and MAE; and the improvement in accuracy of BDS/MODIS PWV relative to the original MODIS-PWV, including RMSE reduction and MAE reduction.

Radiosonde Station	R	RMSE (mm)	RMSE Reduction (%)	MAE (mm)	MAE Reduction (%)
RSCS	0.9809	4.82	81.60	3.89	80.14
RSHH	0.9752	4.63	80.87	3.74	79.98
RSCZ	0.9605	4.69	77.67	3.81	75.96
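The reduction columns in Table 3 are consistent with a relative reduction computed against the MODIS-PWV errors in Table 1, i.e., (MODIS error minus fusion error) divided by the MODIS error. The sketch below applies this formula to the tabulated values; the formula is inferred from the numbers rather than quoted from the paper, and the function and variable names are illustrative.

```python
def percent_reduction(baseline, fused):
    """Relative error reduction (%) of the fused product versus the baseline."""
    return (baseline - fused) / baseline * 100.0


# (RMSE, MAE) in mm, copied from Table 1 (MODIS-PWV vs. RS-PWV).
modis_errors = {
    "RSCS": (26.20, 19.59),
    "RSHH": (24.20, 18.68),
    "RSCZ": (21.00, 15.85),
}

# (RMSE, MAE) in mm, copied from Table 3 (BDS/MODIS PWV vs. RS-PWV).
fusion_errors = {
    "RSCS": (4.82, 3.89),
    "RSHH": (4.63, 3.74),
    "RSCZ": (4.69, 3.81),
}

reductions = {}
for station, (modis_rmse, modis_mae) in modis_errors.items():
    fusion_rmse, fusion_mae = fusion_errors[station]
    reductions[station] = (
        round(percent_reduction(modis_rmse, fusion_rmse), 2),
        round(percent_reduction(modis_mae, fusion_mae), 2),
    )

for station, (rmse_red, mae_red) in reductions.items():
    print(f"{station}: RMSE reduction {rmse_red:.2f}%, MAE reduction {mae_red:.2f}%")
```

Rounded to two decimals, the computed values match the RMSE and MAE reduction columns of Table 3 for all three stations.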

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Sun, M.; Pang, Z.; Lu, J.; Jiang, W.; Qin, X.; Zhou, Z. Fusion of BeiDou and MODIS Precipitable Water Vapor Using the Random Forest Algorithm: A Case Study of Multi-Source Data Synergy in Hunan Province, China. Remote Sens. 2026, 18, 104. https://doi.org/10.3390/rs18010104


