1. Introduction
Sea fog is a common hazardous weather phenomenon formed when water vapor in the marine boundary layer condenses into tiny liquid droplets or ice crystals, often resulting in horizontal visibility of less than 1 km [
1]. When sea fog occurs, its low visibility significantly disrupts shipping, fisheries, and military operations at sea, underscoring the need for reliable monitoring and early warning. Historically, fog detection relied on sparse observations from coastal stations, ships, and buoys, leaving large areas without coverage.
With the advancement of satellite observation technology, researchers have begun using the radiative differences in passive satellite sensing between fog and clouds for large-scale detection. In 1973, Hunt et al. [
2] first used visible and infrared data to analyze the microphysical properties of water and ice clouds, laying the theoretical foundation for subsequent infrared-based cloud studies. John and Bendix [
3] analyzed ten years of NOAA-AVHRR data to examine fog frequency and developed a daytime fog detection method using visible and mid-infrared channels, later refined with MODIS multi-band reflectance and radiative transfer models to distinguish low cloud from fog. Wu and Li [
4] combined Terra/MODIS data with the two-channel interpolation method (3.7 μm and 11 μm) proposed by Eyer et al. [
5], along with auxiliary parameters such as NDSI and NWVI, to develop an automated sea fog detection algorithm for both day and night, achieving an average accuracy of 0.77. Although such thresholding approaches remain widely used, their performance is highly sensitive to solar angle, surface roughness, and atmospheric conditions, leading to instability across seasons and regions.
A major challenge in fog detection based on passive remote sensing lies in distinguishing fog from low cloud, as both exhibit similar optical and microphysical properties. The primary difference is that fog forms in direct contact with the ocean surface, whereas low cloud has an elevated base. Active sensors, such as CALIOP, can resolve cloud-base height with a vertical resolution of 30 m and have been widely used to identify fog and cloud layers. For instance, Badarinath et al. [
6] applied CALIOP data to study dense fog over the Ganges Plain, India, demonstrating its viability for fog research. In 2015, Wu et al. [
7] combined CALIOP Level-2 VFM products and Level-1 backscatter data for marine fog detection, extracting sample points representing marine fog, low cloud, mid-high clouds, and clear air sea surfaces, which were then applied to MODIS daytime sea fog studies. Given that CALIOP lidar signals attenuate before reaching the fog base, its performance in detecting dense fog is suboptimal. Xiao et al. [
8] studied sea fog over the northwestern Pacific, using the MERRA-2 reanalysis dataset to identify true fog from CALIOP candidate layers based on near-surface meteorological variations across different weather conditions (sea fog, stratus, stratus precipitation). The method was validated against ICOADS and five climate stations, yielding an accuracy rate of 0.89 and a false alarm rate of 0.02. Nevertheless, CALIOP’s narrow swath, 16-day revisit cycle, and strong signal attenuation in dense fog limit its capability for continuous monitoring [
9]. Passive sensors on geostationary satellites, on the other hand, provide wide-area and high-frequency coverage, but cannot directly observe fog-base height, motivating the use of machine learning to extract nonlinear information from multispectral satellite data.
Recent studies have demonstrated the potential of machine-learning methods to enhance fog detection accuracy. Kim et al. [
10] combined GOCI and Himawari-8 data using a Decision Tree (DT) model to identify marine fog over the Yellow Sea, exploiting GOCI’s 412 nm Rayleigh-corrected reflectance and spatial variability indices for fog discrimination, while Himawari-8 infrared data improved cloud removal and edge detection. The method achieved accuracy and hit rates of 0.67 and 0.66, respectively, but its dependence on visible observations limits applicability under certain conditions. Sim and Im [
11] later employed Himawari-8 infrared channels as input features for machine-learning models over the Yellow and Bohai Seas, finding that Channel 15 (less affected by water vapor absorption) contributed most to sea-fog detection. Fu et al. [
12] integrated Aqua/MODIS and CALIOP data to construct a labeled Arctic sea-fog dataset and compared Random Forest (RF), Support Vector Machine (SVM), multi-layer perceptron (MLP), and fully convolutional network (FCN) algorithms, with the FCN achieving the best balance of detection (0.79) and false-alarm (0.25) rates owing to its ability to capture spatial texture features. To address limited maritime annotations, Xu et al. [
13] proposed an unsupervised domain-adaptation framework using terrestrial fog samples and unlabeled oceanic data to generate pixel-level fog masks, achieving high classification accuracy. Despite these advances, most studies still rely primarily on spectral and textural features derived from satellite imagery, often neglecting the physical processes underlying fog formation and development.
To improve the physical validity of sea-fog retrieval, this study integrates previous research on the physical mechanisms underlying sea-fog formation and development [8, 14, 15, 16]. By combining the feature-extraction capability of machine learning with multi-channel observations from the Fengyun-4 (FY-4) geostationary satellite and meteorological reanalysis data, we construct spectral and physically based features that characterize the multidimensional differences between sea fog, other cloud systems, and clear air conditions.
Previous satellite-based fog studies have primarily focused on the Yellow and Bohai Seas [4, 7, 8, 10, 11, 13, 17, 18, 19, 20, 21], where fog events occur more frequently, persist longer, and form extensive, continuous layers that are easier for satellites to identify. In contrast, fog over the South China Sea tends to be fragmented, small-scale, and short-lived, strongly influenced by monsoonal circulation, high sea-surface temperatures, and vigorous boundary-layer turbulence [22, 23, 24]. These factors reduce spatial continuity and detection reliability, making fog recognition over the South China Sea significantly more challenging and thus scientifically valuable.
Furthermore, the termination of CALIOP lidar observations in 2023 and the radiometric discrepancies between FY-4A and its successor, FY-4B, present major obstacles to maintaining operational monitoring continuity. Machine learning models trained on FY-4A data cannot be directly applied to FY-4B because of differences in spectral response functions and calibration coefficients. Therefore, developing a cross-satellite radiometric recalibration and migration strategy is essential to ensure consistency and interoperability between satellite generations.
To address these issues, this study establishes a machine-learning framework for daytime sea-fog detection over the South China Sea that integrates FY-4A/FY-4B radiometric transfer with physical-mechanism constraints. CALIOP profiles provide single-layer labels (fog, low cloud, mid-high cloud, and clear air), which we spatiotemporally collocate with FY-4 and reanalysis data to construct the spectral/physical input features and the sample dataset. We tune DT, RF, and SVM classifiers using grid search and particle swarm optimization (PSO) under stratified 10-fold cross-validation. The resulting models accurately separate fog from clouds and clear air scenes, while FY-4’s continuous wide-area coverage mitigates sparse in situ observations and limited CALIOP overpasses over the open ocean.
3. Results
3.1. Analysis of Candidate Spectral Characteristics
3.1.1. Reflectance
The spectral characteristics of different atmospheric targets were analyzed using clear air, low cloud, single-layer fog, and mid–high cloud samples. Figure 4 and Figure 5 present the top-of-atmosphere reflectance characteristics across multiple spectral bands. Figure 4 specifically illustrates the interannual variation in mean reflectance values for samples collected in 2019, 2022, and 2023. This variation may result from interannual spectral differences in sea fog and other atmospheric targets or from minor instrumental biases, leading to distinct distributions of mean reflectance and reflectance variability among different samples. Panel (c) in Figure 4 shows FY-4B AGRI reflectance that has been migrated into the FY-4A radiometric domain using the cross-satellite transfer model (see Section 2.3.2). Because FY-4A and FY-4B occupy different sub-satellite longitudes, they observe the same target along different optical paths, making a direct channel-to-channel comparison between them infeasible.
Overall, in the 1.375 μm near-infrared band, the differences among the samples are minimal, with mean variations below 0.03 across three years of data. Therefore, the reflectance channel at 1.375 μm is excluded from the feature matrix for the sample dataset. Clear air samples show significant reflectance differences compared to other samples, except in the 1.375 μm channel.
Figure 5 further reveals that single-layer sea fog and mid/low cloud samples exhibit overlapping reflectance spectral characteristics in the visible band (e.g., 0.47 μm, 0.65 μm), near-infrared band (0.825 μm), and shortwave infrared bands (e.g., 1.61 μm, 2.25 μm). This overlap suggests that relying solely on reflectance from a single wavelength band is insufficient for precise classification [
14]. Thus, incorporating additional features, such as radiative differences in thermal infrared channels, is essential to enhance classification accuracy.
3.1.2. Brightness Temperature and Brightness Temperature Difference
The brightness temperature (BT) and brightness temperature difference (BTD) characteristics were further analyzed to distinguish among different atmospheric targets (
Figure 6). The spectral features show minimal variation between the 3.75 μm high-reflectance channel and the 3.75 μm low-reflectance channel. Clear air samples generally exhibit the lowest brightness temperatures, while mid-high clouds display the highest upper-quartile values, indicating a broad range of brightness temperatures. In terms of median values of cloud/fog, mid-high cloud exhibits the lowest BT, which is lower than that of sea fog and low cloud. Lee et al. [
36] provide a detailed explanation of the spectral differences between liquid clouds (low cloud and sea fog) and ice clouds (mid-high cloud) in the shortwave infrared band (3.9 μm of GOES-8~9) based on satellite imagery, supporting the results of this study regarding the BT distribution of liquid and ice clouds at 3.75 μm.
The 10.7 μm band, located within an atmospheric window, is primarily influenced by CO2 absorption bands and is less affected by other non-dominant atmospheric gases (e.g., water vapor and ozone). As a result, the BT data received at 10.7 μm predominantly comes from thermal radiation emitted by target objects such as clouds and fog [
37]. Therefore, 10.7 μm BT is commonly used to estimate cloud top height. In clear air regions, with no cloud cover and a well-defined vertical atmospheric structure, radiation received by the sensor mainly originates from the Earth’s surface, leading to higher BT values. In contrast, cloud-covered areas, especially those with mid-high cloud, exhibit lower BT values due to the colder temperatures at the cloud tops. This inverse relationship between BT and cloud top height supports the findings in
Figure 6, where clear air samples show the highest BT, followed by sea fog and low cloud, with mid-high cloud having the lowest values.
The variation in brightness temperature difference primarily reflects the differing absorption characteristics and phase states of ice crystals and water droplets. Most low clouds and sea fog consist of water droplets, with some low clouds potentially containing a mixture of water droplets and ice crystals. Mid-high clouds, on the other hand, are composed of ice crystals [
38], which accounts for the higher brightness temperature differences observed in mid-high clouds compared to other cloud types.
3.1.3. Texture Characteristics
Sea fog in the South China Sea is primarily advective fog, characterized by sea surface air temperatures exceeding SST. This occurs through eddy diffusion, where warm, moisture-saturated air from upper layers is transported to the cooler sea surface [
39]. Compared to other cloud types, sea fog exhibits more uniform and smoother textural characteristics, which manifest as lower spatial variability of BT in the thermal infrared channel. In this study, the 10.7 μm channel within the atmospheric window is used to explore thermal radiation differences between cloud and fog targets, thereby assessing textural smoothness [
4].
To further illustrate these textural differences,
Figure 7 shows the standard deviation of BT at 10.7 μm (STD BT10.7) within 2 × 2 spatial grid cells for sea fog, clear air, and other cloud types. Given that each FY-4 grid cell has a spatial resolution of 4 km, each 2 × 2 window spans 8 km × 8 km, so this study examines the textural variations of cloud and fog features over a 64 km² area. The distribution indicates that the standard deviation of BT for sea fog is much lower than that of mid-high cloud. Furthermore, sea fog data predominantly fall within a lower standard deviation range compared to other cloud systems, suggesting that thermal radiation in sea fog remains relatively stable over a 64 km² area, with highly uniform texture. In contrast, other cloud types exhibit greater variability and irregularity in texture, providing a useful criterion for distinguishing sea fog from other clouds.
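The texture feature described above can be illustrated with a minimal sketch. It assumes non-overlapping 2 × 2 blocks and the population standard deviation; the text does not state whether the windows overlap or which estimator was used, and the function name `block_std` and the toy brightness temperatures are hypothetical.

```python
from statistics import pstdev

def block_std(bt, size=2):
    """Population standard deviation of BT in non-overlapping size x size blocks.

    `bt` is a list of equal-length rows of brightness temperatures (K).
    Incomplete blocks at the right or bottom edge are ignored.
    """
    rows, cols = len(bt), len(bt[0])
    out = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            window = [bt[r + i][c + j] for i in range(size) for j in range(size)]
            out_row.append(pstdev(window))
        out.append(out_row)
    return out

# Toy 10.7 um brightness temperatures (K), values invented for illustration:
# a smooth, fog-like left block and a rough, cloud-like right block.
bt_demo = [
    [288.0, 288.2, 270.0, 255.0],
    [288.1, 288.1, 262.0, 248.0],
]
texture = block_std(bt_demo)
```

In this toy case the smooth block yields a standard deviation well below 0.1 K, while the rough block exceeds 8 K, which is the kind of contrast the feature is meant to capture.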
3.1.4. Auxiliary Parameters
The Normalized Difference Snow Index (NDSI) distinguishes snow cover from other land surfaces by measuring reflectance in the visible and shortwave infrared bands [
40]. Mid-high clouds, which are typically composed of ice crystals due to their higher altitudes, exhibit spectral characteristics similar to those of snow [
41]. Due to scattering effects, mid-high clouds have higher reflectance in the visible spectrum than other cloud types, making them appear brighter in true-color imagery, while they show lower reflectance in the shortwave infrared band.
Based on the characteristic difference between the 0.47 μm visible reflectance (R0.47) and the 2.25 μm shortwave infrared reflectance (R2.25), the NDSI provides a powerful criterion for cloud phase discrimination. In this study, clear air samples exhibit the highest NDSI distribution and median values, followed by mid-high clouds, while liquid-phase low clouds and sea fog exhibit similar, lower medians (Figure 8). Following the established index definitions for ice-phase characterization [40], the NDSI is calculated as follows:

$$\mathrm{NDSI} = \frac{R_{0.47} - R_{2.25}}{R_{0.47} + R_{2.25}}$$
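As a minimal sketch of the index computation, the function below assumes the standard normalized-difference form applied to the 0.47 μm and 2.25 μm reflectances. The function name `ndsi` and the sample reflectance values are hypothetical and are not taken from the study's data.

```python
def ndsi(r_047, r_225):
    """Normalized difference of the 0.47 um and 2.25 um reflectances.

    Returns NaN when both reflectances are zero, where the ratio is undefined.
    """
    total = r_047 + r_225
    if total == 0:
        return float("nan")
    return (r_047 - r_225) / total

# Invented example: a dark target that reflects very little at 2.25 um
# gives an index close to 1.
ndsi_demo = ndsi(0.12, 0.01)
```

Because the index is a ratio, it is insensitive to a common scaling of both reflectances, which is one reason normalized differences are often preferred over raw band differences.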
As thermal infrared channels primarily receive radiant energy from the emissions of target objects, and the emissivity of clouds and fog in the longwave infrared window (10.7 μm) is approximately 1, observations from this channel are highly effective for estimating the cloud/fog top temperature. Due to the vertical temperature gradient in the troposphere, the BT at 10.7 μm (BT10.7) typically decreases as the cloud top altitude increases. Following the methodology of Garand [
42], which defines cloud height features based on the temperature contrast between the cloud top and the underlying surface, the cloud/fog top height (H) can be estimated using the SST and the observed BT10.7. By applying the standard atmospheric lapse rate, γ = 0.65 °C/(100 m) (indicating a temperature decrease of 0.65 °C for every 100 m increase in altitude), the target altitude is determined by the difference between the cloud-top BT10.7 and the SST of the underlying sea surface. The calculation formula for the fog top height is as follows:

$$H = \frac{\mathrm{SST} - \mathrm{BT}_{10.7}}{\gamma}$$
Obtaining BT data for clear air sea surfaces under cloud cover is challenging. Therefore, this study employs ERA5 reanalysis data to derive SST along the CALIPSO scan tracks. As shown in
Figure 9, under clear air conditions, the radiant energy received by the 10.7 μm channel primarily comes from thermal radiation emitted by the sea surface, which is substantial and minimally influenced by cloud cover. This results in a relatively high median position in the box plot, with a concentrated data distribution. In contrast, when sea fog or other cloud systems are present, the 10.7 μm BT primarily originates from the fog/cloud top rather than the sea surface. This lowers the temperature difference (BT10.7–SST): the decrease is smallest for sea fog, larger for low cloud, and largest for mid-high cloud. Mid-high clouds exhibit the lowest average BT10.7–SST value, reaching approximately −30 °C, which aligns with the physical principle that BT decreases with increasing cloud top height.
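The lapse-rate estimate can be worked through numerically. The sketch below assumes the height follows from dividing the SST-minus-BT10.7 contrast by γ = 0.65 °C per 100 m; the function name `top_height_m` and the sample temperatures are hypothetical.

```python
# Standard lapse rate from the text: 0.65 degC per 100 m, expressed per metre.
LAPSE_RATE_C_PER_M = 0.65 / 100.0

def top_height_m(sst_c, bt107_c, lapse=LAPSE_RATE_C_PER_M):
    """Estimate cloud/fog top height (m) from SST and BT10.7, both in degC.

    Assumes temperature falls linearly with height at `lapse` (degC per metre).
    """
    return (sst_c - bt107_c) / lapse

# A contrast of about -30 degC (the mid-high cloud average quoted above)
# corresponds to a top near 4.6 km under this assumption.
mid_high_top = top_height_m(sst_c=27.0, bt107_c=-3.0)

# An invented 2 degC contrast gives a shallow layer of roughly 300 m.
shallow_top = top_height_m(sst_c=25.0, bt107_c=23.0)
```

The estimate is only as good as the constant lapse-rate assumption; under a strong inversion, the true top height can differ noticeably from this value.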
3.2. Analysis of Key Parameters in the Physical Mechanism
Sea fog in the South China Sea primarily consists of advection-cooled fog, and its formation is closely linked to favorable wind direction and speed, a stable atmospheric structure maintained by temperature inversions, and high relative humidity [8, 14, 43]. This study examines key physical quantities related to sea fog formation, including meridional and zonal wind components at 10 m height, air temperature at various vertical layers, and relative humidity. The statistical distributions of wind, temperature inversion, and humidity are shown in Figure 10.
During the formation and maintenance of sea fog, wind speeds are generally low, typically under 4 m/s, facilitating heat exchange between warm, moist air and the cold sea surface, which leads to fog formation [
14]. From the 10 m zonal and meridional wind fields, the median zonal wind of sea fog is about −1 m s⁻¹ (easterly), and the median meridional wind is about −2.5 m s⁻¹ (northerly). The zonal wind of sea fog is weaker than that of clear air, while its meridional wind is stronger. Overall, the wind speed of sea fog is lower than that of other cloud systems, consistent with the conditions for sea-fog formation and maintenance.
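To relate the component medians to wind speed and direction, the following sketch converts a (u, v) pair using the usual meteorological convention, in which direction is the bearing the wind blows from. Combining the two medians into one vector is only an illustration, since the median of the speeds need not equal the speed of the median components; the function names are hypothetical.

```python
import math

def wind_speed(u, v):
    """Horizontal wind speed (m/s) from zonal (u) and meridional (v) components."""
    return math.hypot(u, v)

def wind_from_direction_deg(u, v):
    """Bearing the wind blows FROM, in degrees clockwise from north."""
    return math.degrees(math.atan2(-u, -v)) % 360.0

# Median 10 m components quoted for sea fog: u = -1 m/s, v = -2.5 m/s.
speed = wind_speed(-1.0, -2.5)
direction = wind_from_direction_deg(-1.0, -2.5)
```

Under these assumptions the speed is about 2.7 m/s, below the 4 m/s figure quoted above, and the direction is roughly 22°, that is, a north-northeasterly flow.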
The presence of a temperature inversion layer is another critical factor. The inversion confines water vapor within a specific vertical range, creating a stable cold atmospheric layer that reduces turbulent mixing. This enhances heat exchange between warm, moist air and the cold sea surface, thereby promoting the formation and persistence of sea fog. For example, sea fog along the US West Coast is characterized by a strong inversion within the boundary layer [
16].
In the South China Sea, sea fog is typically shallow, and the base of the capping inversion is frequently observed within the lower few hundred meters (approximately 180–400 m). In some warm-advection cases, fog may interact with overlying stratus, and the cloudy layer can deepen to exceed 1000 m [23, 39, 44]. Given these characteristics and the available vertical nodes in the auxiliary meteorological profiles, we quantify inversion strength using the temperature contrast between 1000 m and 50 m, defined as T1000m−T50m. This metric serves as an effective proxy for lower-tropospheric stability relevant to fog maintenance: stronger positive values indicate a more stable stratification with weaker lapse rates, which limits vertical moisture transport and helps sustain the shallow fog layer. In our dataset, sea fog exhibits markedly stronger inversion strength than low cloud, mid–high cloud, and clear-air conditions, consistent with a more stable near-surface structure that inhibits vertical mixing and moisture dilution.
Water vapor, represented by relative humidity (RH), also plays a key role in sea fog development. Higher relative humidity favors fog formation [
14], particularly under stable boundary-layer conditions. Importantly, the vertical structure of RH, rather than a single-level value, provides discriminative information for separating sea fog from stratiform clouds. Previous studies have demonstrated that the vertical RH gradient is an effective discriminator between sea fog and low stratus, with sea fog typically exhibiting very high near-surface RH and a rapid decrease with height, while stratus tends to show a relatively higher RH maximum aloft (e.g., around ~1 km) [
8]. In this study, RH_50m serves as the representative variable for near-surface humidity. Observations indicate that at 50 m, single-layer fog has a higher relative humidity, with median relative humidity values above 85%, which is conducive to sea fog formation. At 1000 m, the relative humidity of sea fog begins to decrease, while that of low cloud increases. At 2000 m, both sea fog and low cloud exhibit reduced relative humidity, consistent with the structural characteristics of sea fog, which typically has a thickness of about 300–400 m and a relatively low fog-top altitude [
23]. In contrast, mid-high clouds mostly show high relative humidity at 2000 m.
3.3. Validity of the Transfer Model
To verify the effectiveness of the radiometric transfer model in mitigating domain shifts between sensors, we constructed a representative independent test set (N = 264) covering different mission stages. Specifically, we selected samples from 2019 (early FY-4A phase) and 2022 (late FY-4A phase) together with 2023 (January–April, FY-4B initial operational phase) to rigorously evaluate the transfer capability across varying instrument states. The 21-dimensional feature matrix (detailed in Section 2.3.3) served as the input.
Table 7 presents a comparative evaluation of the DT, RF, and SVM models under two experimental settings: the pre-migration baseline (using native FY-4B radiances directly) and the post-migration framework (using FY-4B radiances transferred to the FY-4A radiometric domain). The results demonstrate that the radiometric transfer improves detection performance for most categories (clear air, low cloud, and sea fog). For the sea fog category, POD increased from 0.74 to 0.77 for DT, from 0.62 to 0.73 for RF, and from 0.71 to 0.73 for SVM. The F1-score increased from 0.70 to 0.74 for DT, from 0.71 to 0.79 for RF, and from 0.73 to 0.77 for SVM. A slight decrease was observed for mid-high cloud, likely because the transfer model primarily emphasizes surface-related radiometric features and lacks sufficient representation of upper-level cloud characteristics above approximately 2 km (detailed discussion in
Section 4.1). Taken together, the results support the practical value of a radiometric transfer framework that combines machine learning with physical constraints for sea fog detection over the South China Sea.
3.4. Features Optimization and Ablation Experiment
To assess individual feature contributions and handle potential redundancy among correlated spectral channels, we conducted a group-wise ablation analysis and a Sequential Forward Selection (SFS) using the fixed stratified train–test split spanning the period from 2019 to 2023. The baseline model (“Baseline (All)”) refers to the RF trained with the complete 21-feature set, including: (i) optical reflectance bands (ref0.47, ref0.65, ref0.825, ref1.61, ref2.25), (ii) thermal infrared brightness temperatures (bt10.7, bt3.75H, bt3.75L), (iii) brightness temperature differences (DBT10.7–12, DBT8.5–10.7, DBT3.75L–10.7), (iv) derived indices (NDSI0.47/2.25, STDbt10.7, BT10.7–SST), (v) physical-mechanism parameters (surface_wind_speeds_10m_u, surface_wind_speeds_10m_v, temperature_inversion, RH_50m, RH_1000m, RH_2000m), and (vi) the Year variable.
The group-wise ablation results (
Table 8) illustrate how different feature groups influence detection performance. Removing the physical-mechanism (meteorological) parameters leads to the largest degradation, confirming that these variables provide the primary physical constraints by characterizing advection (10 m wind components), boundary-layer stability (temperature inversion), and near-surface moisture structure (multi-level RH). In quantitative terms, the absence of this group reduces Fog-F1 from 0.8440 to 0.7619 (ΔFog-F1 = −0.0821), accompanied by a substantial drop in Fog-Recall (0.7881 to 0.6887; ΔRecall = −0.0993), Fog-CSI (0.7301 to 0.6154; ΔCSI = −0.1147), and Macro-F1 (0.9044 to 0.8461; ΔMacro-F1 = −0.0583).
By contrast, removing optical reflectance increases Fog-Recall slightly (ΔRecall = +0.0132) but decreases Fog-F1 (ΔFog-F1 = −0.0180), indicating that reflectance cues mainly help suppress false alarms and improve precision. Removing BT (absolute brightness temperatures) yields small improvements in Fog-Recall (ΔRecall = +0.0199) and Fog-CSI (ΔCSI = +0.0049) with negligible change in Fog-F1 (ΔFog-F1 = −0.0066) and a slight gain in Macro-F1 (ΔMacro-F1 = +0.0023), suggesting complementary information that is partly redundant with other spectral/thermal predictors. In contrast, removing BT-difference features or indices causes small but consistent declines in Fog-F1 and Fog-CSI, implying that these derived variables contribute additional discriminative power beyond raw BT/reflectance alone. Finally, removing Year produces only marginal changes, indicating that temporal information provides limited incremental benefit for this dataset under the current sampling strategy.
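The fog metrics used in the ablation can be written down compactly. The sketch below uses the standard definitions of precision, recall, F1, and CSI from true positives, false positives, and false negatives, and it also shows that F1 is a fixed function of CSI. The counts passed to `class_metrics` are illustrative values chosen to be consistent with the reported baseline recall and CSI; they are not taken from the study's tables.

```python
def class_metrics(tp, fp, fn):
    """Per-class precision, recall (POD), F1, and critical success index (CSI)."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "csi": tp / (tp + fp + fn),
    }

def f1_from_csi(csi):
    """F1 implied by a given CSI; both depend only on TP, FP, and FN."""
    return 2 * csi / (1 + csi)

# Illustrative counts (assumed, not reported): 151 fog samples, 119 detected.
baseline = class_metrics(tp=119, fp=12, fn=32)

# Reported baseline and ablated CSI values map onto the reported F1 values.
f1_full = f1_from_csi(0.7301)       # about 0.8440
f1_no_physics = f1_from_csi(0.6154) # about 0.7619
```

Because F1 increases monotonically with CSI, the two metrics always rank configurations identically for a single class; recall and precision carry the additional information about which kind of error changed.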
Given these outcomes and the goal of reducing spectral redundancy, we further performed a within-group refinement using SFS with the six meteorological parameters fixed as a baseline. A feature was retained only if it improved Sea Fog F1, CSI, or Macro F1 by >0.1%. This process yielded a compact subset of seven representative predictors: ref0.65, bt10.7, BT10.7–SST, ref1.61, DBT8.5–10.7, ref2.25, and STDbt10.7. Evaluations using the same RF configuration confirm that the SFS-refined feature set improves both fog detection sensitivity and overall balance. Specifically, fog recall increases by 2%, fog F1 increases by 1.5%, and fog CSI increases by 2.3%, while macro-average F1 also improves by 0.69% (
Table 9), outperforming the full 21-feature baseline. Based on these results, the SFS-refined 13-feature configuration is adopted as the main feature set for subsequent model comparisons.
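The selection procedure can be sketched as a greedy loop. This is a simplified illustration rather than the study's implementation: it tracks a single score instead of the three metrics mentioned above, interprets the 0.1% threshold as an absolute gain of 0.001, and uses a hypothetical additive `toy_score` in place of retraining a classifier.

```python
def forward_select(candidates, fixed, score_fn, min_gain=0.001):
    """Greedy sequential forward selection.

    Starts from the `fixed` features and repeatedly adds the candidate that
    raises `score_fn` the most, stopping when the best gain is <= min_gain.
    """
    selected = list(fixed)
    best = score_fn(selected)
    remaining = list(candidates)
    while remaining:
        # Score every remaining candidate; ties resolve by feature name.
        top_score, top_feat = max((score_fn(selected + [f]), f) for f in remaining)
        if top_score - best <= min_gain:
            break
        selected.append(top_feat)
        remaining.remove(top_feat)
        best = top_score
    return selected, best

# Hypothetical additive score standing in for a cross-validated fog F1.
TOY_GAINS = {"ref0.65": 0.05, "bt10.7": 0.03, "ref0.47": 0.0005}

def toy_score(features):
    return 0.76 + sum(TOY_GAINS.get(f, 0.0) for f in features)

chosen, final_score = forward_select(
    candidates=["ref0.47", "bt10.7", "ref0.65"],
    fixed=["RH_50m"],
    score_fn=toy_score,
)
```

In the toy run, `ref0.65` and `bt10.7` are added in that order, and `ref0.47` is rejected because its gain falls below the threshold.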
To better understand the decision logic of the optimized model, we analyzed the permutation importance for the SFS-refined 13-feature RF model based on Fog F1 (
Figure 11). The results show that ref0.65 provides the largest contribution, indicating that visible reflectance is a key discriminator for fog and cloud/clear air scenes through their distinct optical thickness and scattering characteristics. Thermal variables also play an important role. BT10.7 and the temperature contrast term BT10.7 minus SST quantify the relationship between cloud top brightness temperature and the underlying sea surface, which is physically consistent with the near-surface and shallow nature of sea fog layers and helps separate surface-attached fog from elevated clouds. The contribution of ref2.25 further supports the utility of shortwave infrared sensitivity to droplet size and liquid water content for distinguishing fog from higher cloud types.
Meteorological variables, including multi-level relative humidity and temperature inversion, and especially RH_1000m, provide a fundamental signal for fog identification, which highlights the importance of vertical humidity stratification and thermodynamic stability in distinguishing sea fog. Horizontal wind characteristics, together with other physical variables, show moderate but consistent importance, reflecting that they mainly constrain the classification to physically plausible environments for fog occurrence. It is worth noting that permutation importance measures the marginal contribution of individual variables, whereas the group-wise ablation captures their joint impact. The ablation results indicate that the physical-mechanism parameters act primarily through their mutual interactions to impose environmental constraints on sea fog formation, rather than as independent predictors. This explains why individual meteorological features appear less dominant in the permutation ranking. For completeness, we also report the Random Forest impurity-based feature importance in
Appendix A.
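Permutation importance, as used above, can be illustrated with a small standard-library sketch: one feature column is shuffled at a time and the resulting drop in fog F1 is averaged over repeats. The toy classifier, data, and function names are hypothetical, and the sketch is not the configuration used in the study.

```python
import random

def fog_f1(y_true, y_pred, positive=1):
    """F1-score for the positive (fog) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def permutation_importance(predict, X, y, metric, n_repeats=10, seed=0):
    """Mean drop in `metric` after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = metric(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by its shuffled copy.
            X_perm = [row[:j] + [column[i]] + row[j + 1:] for i, row in enumerate(X)]
            drops.append(baseline - metric(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy classifier that looks only at feature 0; feature 1 is pure noise.
def toy_predict(X):
    return [1 if row[0] > 0.5 else 0 for row in X]

X_demo = [[0.9, 0.1], [0.8, 0.7], [0.2, 0.3], [0.1, 0.9], [0.7, 0.5], [0.3, 0.2]]
y_demo = [1, 1, 0, 0, 1, 0]
importances = permutation_importance(toy_predict, X_demo, y_demo, fog_f1)
```

Shuffling the unused feature leaves the predictions unchanged, so its importance is exactly zero, while shuffling the feature the model depends on lowers the score. Features that act mainly through interactions can receive small individual scores under this measure, which is consistent with the contrast between the permutation ranking and the group-wise ablation noted above.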
3.5. Performance Comparison of Machine Learning Models
Based on the SFS-refined 13-feature subset and optimized hyperparameters (RF: n_estimators = 340; SVM: C = 57.1, Gamma = 0.65; DT: max_depth = 14),
Table 10 presents the final comparative evaluation. With the optimized configuration applied, RF achieved the highest overall accuracy at 0.91, followed by SVM at 0.87 and DT at 0.86. This pattern is consistent with the ensemble advantage of RF in handling complex spectral-physical interactions. For the clear air category, RF performed best with an F1-score of 0.92, while DT and SVM obtained 0.88 and 0.84, respectively. For low cloud, both RF and SVM maintained strong recognition performance, reaching F1-scores of 0.93 and 0.92, while DT remained lower at 0.88. For mid-high cloud, RF and SVM yielded comparable F1-scores of 0.94, outperforming DT (0.90).
For the sea fog category over the South China Sea, all three models showed different detection abilities. The F1 scores for sea fog were 0.76 (DT), 0.85 (RF), and 0.80 (SVM), with POD (Recall) of 0.74 (DT), 0.81 (RF), and 0.80 (SVM), respectively. The RF model exhibited the highest performance in both precision (0.90) and recall, indicating it successfully detected the majority of sea fog events while effectively minimizing false alarms. However, the F1-score of sea fog was slightly lower than for the other classes, which reflects the more complex radiative and physical characteristics of sea fog.
Comparative evaluation of sea-fog detection studies provides insight into how regional environments, satellite sensors, and labeling strategies influence model performance. Significant variability exists among regions due to atmospheric and oceanic conditions, fog morphology, and differences in validation methodology. Representative studies and their performance metrics are summarized in
Table 11. Comparative evaluation indicates that the optimized RF model offers improved detection capabilities for the South China Sea. Specifically, relative to existing studies in this region (POD 0.71–0.76), our model achieved a POD of 0.81 and an F1-score of 0.85. However, it is important to acknowledge that this performance still presents a notable gap compared to the advanced benchmarks achieved in the Yellow and Bohai Seas (where PODs can reach 0.94). This disparity is largely attributed to the greater environmental challenges in our study area compared to the northern seas.
Fog characteristics differ substantially between the two regions: Yellow and Bohai Sea fog events occur more frequently, persist longer, and form extensive, continuous layers, which makes them easier for satellites to identify. In contrast, fog over the South China Sea tends to be fragmented, small-scale, and short-lived, posing greater difficulty for satellite-based retrieval. Furthermore, differences in satellite sensors affect performance evaluation: polar-orbiting satellites generally provide higher observational accuracy due to finer spatial resolution, whereas geostationary satellites such as FY-4A/B offer higher temporal coverage but coarser resolution (4 km), leading to slightly reduced detection precision.
It is also important to note that variations in ground-truth extraction and validation methods contribute to discrepancies among studies. For instance, sea-fog labels derived from ground-station visibility records are typically validated using the same type of station data, while those extracted from CALIOP active-lidar observations are verified using CALIOP. CALIOP-based labels usually represent single-layer fog or cloud, whereas station-based labels do not distinguish vertical structure, which can further influence accuracy metrics.
Overall, the validation results demonstrate that the FY-4B geostationary satellite effectively fills the observational gap in sea-fog monitoring over the South China Sea, providing slightly improved performance compared with FY-4A. These findings confirm the potential of FY-4B multi-source fusion with physical-mechanism parameters and machine learning optimization for operational, high-resolution sea-fog detection in this region.
To evaluate the classification performance of each model, confusion matrices were constructed for the three machine learning algorithms, as shown in
Figure 12. During the classification of single-layer fog, all models predominantly misclassified it as either clear air or low cloud. The RF model correctly classified 122 instances of single-layer fog, misclassifying 17 as clear air, 6 as mid-high cloud, and 6 as low cloud, resulting in a recall of 0.81. The 17 instances misclassified as clear air likely correspond to optically thin fog or fragmented fog edges. In these scenarios, the satellite sensor receives significant signal contributions from the underlying dark ocean surface, causing the spectral signature to resemble clear-sky water rather than a bright fog layer.
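The recall quoted above follows directly from the fog row of the RF confusion matrix. As a minimal sketch (the dictionary layout and the function name `recall_from_row` are illustrative assumptions, not the study's code), the arithmetic can be reproduced as follows:

```python
# Counts for true single-layer fog samples as classified by RF (taken from the text).
rf_fog_row = {
    "sea_fog": 122,
    "clear_air": 17,
    "mid_high_cloud": 6,
    "low_cloud": 6,
}


def recall_from_row(row, true_label):
    """Recall (POD) = correctly classified samples / all samples of that class."""
    total = sum(row.values())
    return row[true_label] / total


rf_fog_recall = recall_from_row(rf_fog_row, "sea_fog")

# Share of the missed fog samples that were labeled clear air.
missed = sum(rf_fog_row.values()) - rf_fog_row["sea_fog"]
clear_air_share = rf_fog_row["clear_air"] / missed

print(f"{rf_fog_recall:.2f}")  # prints 0.81
```

The same row shows that clear air accounts for 17 of the 29 missed fog samples, which is why the thin-fog and fog-edge explanation carries most of the recall loss.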
Although the spectral characteristics of low cloud and sea fog are very similar, each of the three models misclassified no more than 8 of the 151 low cloud test samples as sea fog. Additionally, the RF and SVM models misclassified only 6 and 5 fog samples as low cloud, respectively. This suggests that the introduction of physical constraints (e.g., BT10.7–SST and RH_1000m) contributes positively to differentiating surface-contacting fog from elevated stratus, although some spectral ambiguity inevitably persists.
It is also worth noting that while SVM achieves a recall comparable to that of RF, it produces far more false positives at the clear-air boundary. As shown in the confusion matrix, SVM misclassified 24 clear air samples as sea fog, whereas RF misclassified only 5. This indicates that the RF model is considerably more robust in correctly rejecting clear-sky scenes, maintaining high precision without sacrificing sensitivity.
3.6. Case Analyses
To evaluate the operational robustness and generalization capability of the final optimized model, we retrieved sea fog cases over the South China Sea using the three classifiers (DT, RF, and SVM), each configured with the optimal 13-feature subset determined in the ablation study (
Section 3.4). Spectral features from FY-4B (2023–2025) were integrated with reanalysis data for multi-source fusion retrieval. The retrievals were further supported by visibility data from the National Oceanic and Atmospheric Administration (NOAA, Asheville, NC, USA) Integrated Surface Dataset (ISD) (
https://www.ncei.noaa.gov/metadata/geoportal/rest/metadata/item/gov.noaa.ncdc:C00532/html, accessed on 19 May 2025) and auxiliary assessments from visibility monitoring stations in Zhanjiang, China.
Figure 13 illustrates the retrieval results of three representative sea-fog cases over the South China Sea. In this region, sea fog is generally identified when near-surface visibility drops below 1 km. However, because most coastal ground observation stations are located some distance inland from the shoreline, the visibility threshold used to identify sea fog can be appropriately relaxed in practice.
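The idea of a relaxed station threshold can be sketched as a two-level labeling rule. This is only an illustration under stated assumptions: the function name `station_fog_flag`, the label strings, and the relaxed limit of 5 km are hypothetical (the study does not state the relaxed value it used), while the 1 km limit is the conventional one given in the text.

```python
def station_fog_flag(visibility_km, relaxed_km, strict_km=1.0):
    """Label one station visibility report.

    strict_km  : conventional sea-fog visibility limit (1 km, from the text).
    relaxed_km : looser limit for stations inland of the shoreline
                 (the value is an assumption supplied by the caller).
    """
    if relaxed_km < strict_km:
        raise ValueError("relaxed_km must be at least strict_km")
    if visibility_km < strict_km:
        return "fog"
    if visibility_km < relaxed_km:
        return "possible_fog"
    return "no_fog"


# Synthetic visibility reports in km (not station data).
reports = [0.8, 3.0, 8.0]
flags = [station_fog_flag(v, relaxed_km=5.0) for v in reports]
print(flags)  # prints ['fog', 'possible_fog', 'no_fog']
```

Keeping the intermediate label separate from the strict one lets a case comparison distinguish stations that confirm fog from those that are merely consistent with fog offshore.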
On 14 March 2025, sea fog with visibility below 1 km occurred near Zhanjiang, with the Beibu Gulf region experiencing visibility below 5 km. Satellite imagery showed thin cloud cover with a dark hue and uniform texture, with some areas of the sea surface faintly visible. The DT, RF, and SVM models all successfully identified sea fog in the Beibu Gulf area. For the coastal sea fog near Zhanjiang, DT and RF achieved effective detection, while the SVM labeled the core as low cloud but successfully identified sea fog along the surrounding cloud edges.
On 15 March 2024 and 6 February 2023, coastal Zhanjiang again experienced sea fog with visibility below 1 km. In the 2024 case, DT and RF identified the sea fog event. Satellite imagery showed bright, fluffy clouds of appreciable thickness above the fog, and DT, RF, and SVM classified portions of these clouds as low cloud. In the 2023 case, RF classified most of the event as low cloud but identified a small portion as sea fog; the corresponding satellite imagery revealed sea fog surrounded by darker clouds, with a gap exposing the sea surface and thin mist. This distribution closely matched RF’s retrieval. Additionally, RF identified sea fog below low cloud in the Beibu Gulf, where visibility was below 1–5 km. Because low cloud and sea fog can transition into one another during formation, sea fog may indeed have existed beneath the low cloud. The thin cloud distribution in the satellite image aligned more closely with RF’s retrieval.
In summary, the RF model demonstrated the most accurate and detailed classification of clouds, fog, and clear air, particularly excelling in distinguishing low clouds from sea fog. The DT model identified the most sea fog pixels, but their distribution was often fragmented. The SVM model exhibited the highest spatial continuity, but tended to misclassify small, dispersed sea fog patches as homogeneous areas, overlooking detailed boundaries. Under multi-layer cloud cover (e.g., mid/high clouds obscuring low clouds), all models performed less effectively in sea fog detection. However, they performed better under single-layer cloud/fog conditions.
4. Discussion
This study demonstrates the efficacy of a machine learning framework integrating cross-satellite calibration and physical constraints for daytime sea fog detection over the South China Sea. However, several challenges and future directions warrant discussion.
4.1. Model Generalization and Operational Robustness
Regional complexity poses a primary challenge to model generalization. Physical understanding of sea fog is still incomplete, and the Beibu Gulf, Qiongzhou Strait, and Taiwan Strait differ in prevailing synoptic regimes and microphysical properties. The lack of an objective, physics-based classification standard for sea fog samples from these distinct maritime areas may currently constrain the model’s discriminative power when it is applied across regions. To address this, future research should examine region-specific formation mechanisms more closely and incorporate more refined physical parameters, such as air–sea temperature difference and stability-related indicators (e.g., lifting condensation level and vertical wind), to improve regional adaptability. Furthermore, while the current framework prioritizes the composite BT10.7–SST feature to capture the vertical thermal contrast, the role of raw SST remains fundamental and warrants sensitivity experiments under varying seasonal conditions.
Beyond regional variability, operational robustness is influenced by sample size and spatial resolution constraints. The relatively small number of sea fog samples is primarily due to the narrow swath of the CALIOP lidar and the strict screening criteria required to match the 4 km resolution of FY-4 AGRI. While the sample size is limited, the integration of physical mechanism constraints (temperature inversion, boundary layer humidity, and wind field) significantly enhances the model’s generalization capability. Unlike purely data-driven methods that rely solely on spectral textures, these physical parameters represent the universal thermodynamic conditions required for sea fog formation, ensuring robust detection even with limited training data. However, a trade-off exists regarding spatial scale; the 4 km resolution of the AGRI sensor limits the detection of fragmented or small-scale sea fog patches. Techniques such as super-resolution or downscaling offer potential pathways to better capture these fine-scale events.
Another significant detection barrier arises from the inherent limitation of passive remote sensing to penetrate thick cloud layers. Our case analyses confirm that under multi-layer, thick-cloud conditions, the satellite’s spectral signal is dominated by the radiative properties of the uppermost cloud layers. This vertical limitation also constrains the applicability of ground-based quantitative validation. While coastal stations and buoys provide reliable surface visibility records, they cannot characterize the atmospheric structure above the fog. In ‘fog-under-cloud’ scenarios, a station reports fog, but the satellite sensor captures the upper cloud signal. Validating against such data without vertical profiling would introduce discrepancies where the model is penalized for correctly identifying the upper cloud layer, contradicting our training focus on single-layer detection. Consequently, model performance is superior under single-layer or thin-cloud conditions. Future research should focus on quantifying the radiative differences between overlapping cloud layers and underlying fog to enhance detection capability within complex cloud systems.
Finally, advanced deep learning is a powerful technique for improving sea fog detection accuracy. Architectures such as U-Net, C-GAN, or other hybrid models can capture more multiscale spatial context and texture. Building on the effectiveness of the transfer model to address the data continuity challenge after CALIOP data ended in 2023, our subsequent research will extend the current along-track labels into spatial annotations. This will allow for a systematic comparison of sea fog identification accuracy using various deep learning architectures. The ultimate objective is to operationalize a robust FY-4B sea fog monitoring product that provides high-resolution support for fog events over the South China Sea.
4.2. Radiometric Transfer Model
The slight decline in mid-high cloud classification accuracy following the radiometric transfer can be attributed to complex radiative interactions that are challenging for a global empirical transfer model to fully capture.
Physically, mid-high clouds exhibit complex interactions with solar and thermal radiation, including multi-layer scattering, variable optical depths, and phase-dependent (ice vs. liquid) absorption/emission properties [
2,
38]. Unlike liquid droplets, ice crystals in high clouds are non-spherical, inducing complex scattering phase functions that create non-linear spectral signatures [
49]. In our MODTRAN-based simulations, while typical cloud parameters were incorporated, the model’s reliance on global empirical fitting tends to smooth over these localized non-linearities.
Radiometrically, as noted in satellite intercalibration studies, the Spectral Band Adjustment Factor is highly dependent on the target’s spectral signature [
50]. Our MODTRAN simulations prioritized surface and low-level diversity to ensure global robustness. This choice minimizes the overall error across diverse surface types (e.g., ocean, land) but often fails to capture the specific spectral shifts caused by the complex microphysics of high-altitude ice clouds. Specifically, high clouds like cirrus often possess semi-transparent properties, leading to the partial transmission of underlying surface radiation [
37]. This introduces significant variability in the simulated radiance that, when convolved with sensor-specific Spectral Response Functions, amplifies discrepancies in brightness temperatures.
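The dependence of the Spectral Band Adjustment Factor on the scene spectrum can be illustrated with a toy calculation. The sketch below assumes one common formulation of the factor, namely the ratio of band-averaged signals seen through two Spectral Response Functions. The wavelength grid, boxcar response shapes, and spectra are synthetic placeholders, not AGRI response functions or MODTRAN output, and the function names are hypothetical.

```python
def band_average(wavelengths, spectrum, srf):
    """SRF-weighted mean of a spectrum, integrated with the trapezoidal rule."""
    def trapz(values):
        return sum(
            0.5 * (values[i] + values[i + 1]) * (wavelengths[i + 1] - wavelengths[i])
            for i in range(len(wavelengths) - 1)
        )
    weighted = [s * r for s, r in zip(spectrum, srf)]
    return trapz(weighted) / trapz(srf)


def sbaf(wavelengths, spectrum, srf_target, srf_reference):
    """Ratio of band-averaged signals: target sensor over reference sensor."""
    return (band_average(wavelengths, spectrum, srf_target)
            / band_average(wavelengths, spectrum, srf_reference))


# Synthetic grid near the 10.7 um window (illustrative only).
wavelengths = [10.0 + 0.1 * i for i in range(16)]
# Two hypothetical boxcar SRFs, offset from each other by three grid steps.
srf_reference = [1.0 if 2 <= i <= 10 else 0.0 for i in range(16)]
srf_target = [1.0 if 5 <= i <= 13 else 0.0 for i in range(16)]

flat_scene = [8.0] * 16                            # spectrally flat target
sloped_scene = [8.0 + 0.2 * i for i in range(16)]  # signal rising with wavelength

sbaf_flat = sbaf(wavelengths, flat_scene, srf_target, srf_reference)
sbaf_sloped = sbaf(wavelengths, sloped_scene, srf_target, srf_reference)
```

In this toy setup the flat scene yields a factor of 1 regardless of the offset between the two response functions, whereas the sloped scene does not. This is the sense in which a single globally fitted adjustment can misrepresent scenes with unusual spectral shapes, such as semi-transparent cirrus.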
To address these limitations, our future work will focus on integrating advanced cloud microphysics from models like RTTOV and developing scene-adaptive transfer functions that dynamically adjust coefficients based on cloud types.