Article

Multi-Satellite Image Matching and Deep Learning Segmentation for Detection of Daytime Sea Fog Using GK2A AMI and GK2B GOCI-II

by Jonggu Kang 1, Hiroyuki Miyazaki 2, Seung Hee Kim 3, Menas Kafatos 3, Daesun Kim 4, Jinsoo Kim 1 and Yangwon Lee 1,*
1 Major of Geomatics Engineering, Division of Earth and Environmental System Sciences, Pukyong National University, Busan 48513, Republic of Korea
2 Global Data Lancers (GLODAL) Incorporation, Yokohama 231-0062, Japan
3 Institute for Earth, Computing, Human and Observing (ECHO), Chapman University, Orange, CA 92866, USA
4 Ocean Law Research Department, Korea Institute of Ocean Science and Technology (KIOST), Busan 49111, Republic of Korea
* Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(1), 34; https://doi.org/10.3390/rs18010034
Submission received: 2 November 2025 / Revised: 16 December 2025 / Accepted: 19 December 2025 / Published: 23 December 2025
(This article belongs to the Section AI Remote Sensing)

Highlights

What are the main findings?
  • The integration of data from the GK2A AMI and GK2B GOCI-II significantly enhances the accuracy of sea fog detection around the Korean Peninsula.
  • Key factors enhancing the reliability of this detection are deep learning-based image co-registration and the appropriate optimization of SOTA Transformer models.
What are the implications of the main findings?
  • Comparisons with the officially operational GK2A AMI Fog and GK2B GOCI-II MF products revealed that our deep learning approach was superior to both existing products. The proposed model effectively mitigated both under- and overestimation issues by comprehensively learning diverse spectral and spatial patterns. Therefore, the deep learning approach should be actively considered for advancing the currently operational sea fog products.
  • To maximize the advantages of deep learning-based image co-registration and optimized Transformer models, a richer learning database must be constructed to cover a variety of sea fog cases. Oceanic and meteorological contexts (such as temperature, humidity, cloud cover, and advection) should be incorporated as auxiliary data. Furthermore, image augmentation techniques utilizing GAN and diffusion models should also be developed to represent diverse sea fog types.

Abstract

Traditionally, sea fog detection technologies have relied primarily on in situ observations. However, point-based observations suffer from limitations in extensive monitoring in marine environments due to the scarcity of observation stations and the limited nature of measurement data. Satellites effectively address these issues by covering vast areas and operating across multiple spectral channels, enabling precise detection and monitoring of sea fog. Despite the increasing adoption of deep learning in this field, achieving further improvements in accuracy and reliability necessitates the simultaneous use of multiple satellite datasets rather than relying on a single source. Therefore, this study aims to achieve higher accuracy and reliability in sea fog detection by employing a deep learning-based advanced co-registration technique for multi-satellite image fusion and autotuning-based optimization of State-of-the-Art (SOTA) semantic segmentation models. We utilized data from the Advanced Meteorological Imager (AMI) sensor on the Geostationary Korea Multi-Purpose Satellite 2A (GK2A) and the GOCI-II sensor on the Geostationary Korea Multi-Purpose Satellite 2B (GK2B). Swin Transformer, Mask2Former, and SegNeXt all demonstrated balanced and excellent performance across overall metrics such as IoU and F1-score. Specifically, Swin Transformer achieved an IoU of 77.24 and an F1-score of 87.16. Notably, multi-satellite fusion significantly improved the Recall score compared to the single AMI product, increasing from 88.78 to 92.01, thereby effectively mitigating the omission of disaster information. Ultimately, comparisons with the officially operational GK2A AMI Fog and GK2B GOCI-II Marine Fog (MF) products revealed that our deep learning approach was superior to both existing operational products.

1. Introduction

Sea fog is a significant cause of maritime and aviation accidents due to reduced visibility, making its accurate detection and prediction essential for operational safety [1,2]. Some studies report that the loss of life and property caused by sea fog can be as severe as that resulting from tornadoes or hurricanes [3]. Recently, the increasing frequency of fog occurrence due to climate change has further exacerbated the risk of safety incidents, as factors such as rising sea surface temperatures, enhanced air–sea temperature contrasts, and increased low-level atmospheric humidity create more favorable conditions for sea fog formation in coastal regions [3,4].
Traditionally, sea fog detection technologies have relied primarily on in situ point observations. However, these methods suffer from limitations in extensive monitoring, and detection is particularly challenging in marine environments due to the scarcity of observation stations and the limited nature of measurement data [5,6]. Satellites can cover vast areas and operate across multiple spectral channels, enabling precise detection and monitoring of sea fog [7].
Research on sea fog detection using satellite imagery has employed diverse approaches, including (1) rule-based detection, (2) classical neural networks and machine learning, and (3) deep learning-based image recognition.
Rule-based detection algorithms classify sea fog using pre-defined thresholds and decision rules derived from the physical spectral characteristics of satellite images, such as brightness temperature, reflectance, and RGB composites [5,6,7,8,9,10,11,12,13,14,15,16,17,18]. Operational products from agencies such as EUMETSAT, NOAA, and JMA, as well as GOCI-based algorithms, have demonstrated the feasibility of daytime and nighttime sea fog detection over various regions [8,9,10,11,15]. However, these approaches require region- and sensor-specific tuning and often struggle to generalize across different backgrounds, seasons, and cloud conditions because their thresholds are fixed in space and time [5,6,7,12,13,14,15,16,17].
To alleviate the limitations of purely rule-based schemes, machine learning methods (e.g., KNN, RF, SVM, and ERT) have been applied to satellite-derived variables for sea fog classification and dissipation prediction and even combined with domain adaptation strategies to compensate for the lack of marine labels and to support fog and low-stratus nowcasting from geostationary imagery [19,20,21,22,23,24]. Nevertheless, these models still tend to suffer from overfitting and insufficient use of spatial context, and their performance often degrades when applied beyond the training conditions [21,22,23].
More recently, deep learning-based image recognition techniques, particularly CNN and Transformer architectures, have substantially advanced sea fog detection by learning global and local features directly from multispectral satellite imagery [25,26,27,28,29,30,31,32]. U-Net variants, CNN transfer learning models, hybrid CNN–Transformer networks, and dual-branch architectures have been successfully applied to sensors such as MODIS, GOCI, AHI, and GOCI-II, demonstrating improved accuracy in distinguishing sea fog, low clouds, and the sea surface at the pixel level [24,25,26,27,28,29,30]. These studies also explored multi-satellite inputs to mitigate temporal and spatial constraints, showing the potential of deep learning for near-real-time monitoring [20,28,29,30]. However, most existing approaches still rely on a single satellite or simple combinations of sensors without fully exploiting complementary spectral characteristics, and they are often limited by geometric misalignment between platforms [19,20,29,30]. As a result, there remains a clear need for a deep learning framework that (i) robustly co-registers multi-satellite imagery at high precision and (ii) systematically evaluates state-of-the-art semantic segmentation models for multi-sensor sea fog detection.
Despite the active use of deep learning in sea fog detection, further improvements in accuracy and reliability require the simultaneous use of multiple satellite datasets rather than a single dataset. Data provided by a single satellite are restricted to specific spectral bands; beyond the mid- and thermal-infrared bands, a combination of diverse spectral bands from multiple satellite images is necessary to capture the complex meteorological characteristics of sea fog in greater detail.
In the Korean geostationary satellite constellation, the GK2A AMI mainly provides thermal infrared–oriented fog products, whereas the GK2B GOCI-II offers visible and near-infrared marine fog products optimized for ocean color and atmospheric optical characteristics. In practice, GK2A-based fog outputs tend to miss optically thin or low-level sea fog under shallow cloud layers, while GOCI-II Marine Fog (MF) products frequently overestimate fog by confusing bright high-level clouds or sea-surface reflection patterns with fog (see also Section 4.2) [33,34,35].
These complementary strengths and weaknesses indicate that using only GK2A or only GK2B is insufficient for robust sea fog monitoring and that their combined use is expected to provide more balanced detection performance. However, studies combining deep learning models with multi-satellite imagery for sea fog detection, particularly those jointly exploiting GK2A AMI and GK2B GOCI-II, are rarely found. One reason is that, even after geometric correction, spatial misalignment inevitably occurs due to inaccurate co-registration when overlaying images acquired from different satellites.
In this study, we aim to achieve higher accuracy and reliability in sea fog detection by employing a deep learning-based advanced co-registration technique for multi-satellite image combination and the autotuning-based optimization of State-of-the-Art (SOTA) semantic segmentation models. We used the Advanced Meteorological Imager (AMI) sensor of the Geostationary Korea Multi-Purpose Satellite 2A (GK2A) and the GOCI-II sensor of the Geostationary Korea Multi-Purpose Satellite 2B (GK2B). AMI offers diverse spectral bands, including visible, near-infrared, and thermal infrared bands, while GOCI-II provides several detailed visible bands. Combining the complementary band information from these two satellites can yield richer information for sea fog detection than using single-satellite data alone. Hence, we employed Robust Dense Feature Matching (RoMa), a deep learning-based image co-registration model, to prevent inter-image misalignment between the two types of imagery due to different acquisition times, fields of view, or other environmental factors. In addition, a more sophisticated sea fog detection algorithm was constructed through performance evaluations of multiple deep learning segmentation models with hyperparameter autotuning using the Optuna library. We focus on the Korean Peninsula and surrounding seas observed by GOCI-II and evaluate sea fog detection performance while accounting for regional characteristics. This approach is expected to solidify and advance the promising results demonstrated in previous research, further improving the accuracy and real-time monitoring capabilities of sea fog detection and thereby contributing to the safety of maritime and aviation operations.

2. Materials and Methods

We used geostationary satellite imagery from GK2A AMI and GK2B GOCI-II, together with GK2A cloud-top-height products and in situ observations from ASOS and sea fog observation stations, to construct labeled sea fog datasets and train deep learning segmentation models. Figure 1 illustrates the overall research flow, which is divided into image labeling, image co-registration, and model construction. In the image labeling step, fog annotations were created from AMI and GOCI-II imagery and in situ observation data to construct labeled training data. In the image co-registration part, the RoMa model was used to precisely correct positional discrepancies between the two satellite images, thereby generating 6-channel input data. Finally, during the model construction step, the dataset was split into training, validation, and test sets. Various segmentation models were trained, ultimately producing the final sea fog detection results.

2.1. Overview of GK2A and GK2B Satellites

Our sea fog detection is conducted using both GK2A and GK2B satellite imagery. These two geostationary satellites, developed by the Korea Aerospace Research Institute (KARI), share standard systems and body designs but feature different payloads tailored to their respective missions [36]. GK2A AMI employs a multispectral scanning mirror system that captures 16 spectral channels. A dual-axis scanning mirror rotates in both azimuth and elevation, traversing the entire Earth disk every 10 min. Each scan line is collected at a nadir-based spatial resolution of approximately 2 km [37]. In contrast, GK2B GOCI-II employs a push-broom design with 12 independent slot Charge-Coupled Device (CCD) arrays to observe Northeast Asian waters at high resolution repeatedly. By dividing the observation area into 12 segments, the satellite sequentially scans ground pixels in each slot as it orbits the Earth. This enables 10 daily acquisitions at 250 m spatial resolution and one daily acquisition at 1000 m resolution covering the entire hemisphere. A key feature is the application of Time-Delay Integration (TDI) techniques for each slot, which enhance the signal-to-noise ratio (SNR) while enabling precise analysis of complex ocean and atmospheric color characteristics [38]. GK2A was designed for meteorological disaster monitoring and improved forecasting, while GK2B was designed to support marine environmental change monitoring and coastal management. As such, the two satellites produce specialized information for meteorological and ocean observations through their distinct payloads. Jointly using AMI and GOCI-II imagery effectively combines their complementary spectral information, significantly enhancing the performance of sea fog detection. Our study area is the Korean Peninsula and adjacent Northeast Asian waters within the common observation zone of both satellites (Figure 2). For spatial colocation, we restricted GK2A AMI local-area scenes to the footprint of the GOCI-II slots covering this region. The spectral band configurations provided by the two satellites are summarized in Table 1 [33,34].

2.2. Labeling Annotation Data for Sea Fog Detection

Using Closed-Circuit Television (CCTV) footage from the Korea Hydrographic and Oceanographic Agency (KHOA) and other sources, we identified the dates and locations where sea fog occurred from January 2023 to July 2024. We collected AMI L1B and GOCI-II L1B satellite imagery captured between 8:00 AM and 12:00 PM on the confirmed sea fog occurrence dates. AMI data is provided at 10 min intervals, while GOCI-II data acquisition completes around 28 min past the hour for the 7th slot. To use both satellite images, we selected image pairs whose acquisition times differed by less than 30 min and whose footprints overlapped the common study area shown in Figure 2.
Reviewing the literature to select effective bands for sea fog identification revealed that the false-color composite (FCC) of the 0.64 μm, 1.6 μm, and 11.2 μm bands was reported to be effective for Himawari-8 AHI, while the FCC of the 0.865 μm, 0.443 μm, and 0.412 μm bands proved effective for GOCI images [26,30]. These FCCs are also valid for GK2A AMI and GK2B GOCI-II because AMI has the same band configuration as AHI, and GOCI-II carries all three GOCI bands. Sea fog, composed of fine water droplets, exhibits strong reflectance in the 0.64 μm (visible) band due to scattering of sunlight by these droplets. Specifically, within the 0.4–0.7 μm range, sea fog generates a clear reflectance signal due to uniform scattering. However, in the 1.6 μm (shortwave infrared) channel, absorption by water becomes prominent. Crucially, clouds tend to exhibit more absorption and less reflectance than sea fog in this shortwave infrared channel because of their larger droplet size and longer absorption path, following Mie scattering theory. This contrast in spectral properties allows for the effective differentiation of sea fog from clouds through spectral analysis of satellite imagery [38]. In addition, the 11.2 μm (thermal infrared) brightness temperature indicates that sea fog is colder than the surrounding sea surface. Reflecting these spectral characteristics, we selected the AMI 0.64 μm, 1.6 μm, and 11.2 μm bands, along with the GOCI-II 0.412 μm, 0.443 μm, and 0.865 μm bands, as the input channels for the sea fog detection model. GOCI-II FCC images provide detailed spectral and texture information: sea fog typically displays a smooth pink surface texture, while clouds exhibit a relatively coarse white surface texture in GOCI-II FCC images [26].
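For illustration, the snippet below sketches how such FCC composites can be assembled from the three selected bands of each sensor. It assumes the bands are already loaded as 2-D NumPy arrays; the variable names, the band-to-channel ordering (listed as R, G, B), and the percentile-stretch display scaling are illustrative assumptions rather than the exact processing used in this study.

```python
import numpy as np

def stretch(band, lower=2, upper=98):
    """Percentile-stretch one band to [0, 1] for visualization."""
    lo, hi = np.nanpercentile(band, [lower, upper])
    return np.clip((band - lo) / (hi - lo + 1e-6), 0.0, 1.0)

def make_fcc(r, g, b):
    """Stack three stretched bands into an H x W x 3 false-color composite."""
    return np.dstack([stretch(r), stretch(g), stretch(b)])

# AMI FCC: 0.64 um (R), 1.6 um (G), 11.2 um (B); the brightness-temperature
# band is scaled the same way here purely for display purposes.
# ami_fcc = make_fcc(ami_064, ami_160, ami_112)

# GOCI-II FCC: 0.865 um (R), 0.443 um (G), 0.412 um (B)
# goci_fcc = make_fcc(goci_865, goci_443, goci_412)
```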
We performed annotation work for sea fog detection using FCC images from AMI and GOCI-II, AMI cloud top height (CTH) data, and in situ data from ASOS and sea fog observation stations (Figure 3). To create the FCC images for effective identification of sea fog, we used the above-selected three AMI bands as the RGB channels for the AMI FCC imagery and the three selected GOCI-II bands as the RGB channels for the GOCI-II FCC imagery. Cross-analysis of the FCC images, CTH data, and field measurements for fog-occurrence days was conducted to derive common fog characteristics. The AMI FCC images showed fog areas in light green, while the GOCI-II FCC images displayed them in a color close to pink. Drawing on the concept of surface homogeneity for low clouds and fog presented in [18], we identified and extracted fog areas using color and texture in the two FCC images. Furthermore, AMI CTH data was also utilized to reflect the characteristics of sea fog, which is close to the ground and thus at low altitude.
Based on these data, the observation-based labeling followed a three-step procedure.
First, for coastal regions near the ASOS and sea fog observation stations, we delineated initial fog seed areas around stations reporting sea fog by selecting pixels whose FCC color/texture and CTH values were consistent with the typical fog signatures described above.
Second, these seed areas were extended offshore by region-growing along contiguous pixels that preserved homogeneous fog-like FCC textures and low CTH values, allowing fog labels to be assigned over maritime areas without direct in situ coverage.
Third, in offshore regions far from any ground-based station, candidate fog patches were identified solely from FCC and CTH patterns; only pixels forming clearly homogeneous and persistent fog-like structures were labeled as sea fog, whereas ambiguous areas were conservatively assigned as non-sea-fog.
The operational GK2A AMI Fog and GK2B GOCI-II Marine Fog (MF) products were intentionally not used during the labeling stage so that our reference labels would be independent of any existing algorithm and could later serve as an unbiased benchmark for comparing the proposed deep learning model with the operational products.
Given reduced visibility in sea fog conditions, a comparative analysis was conducted using in situ visibility measurement data from ASOS and sea fog observation stations. Specifically, we used visibility observations from 95 land-based ASOS stations with an hourly reporting interval and from 11 coastal sea fog observation stations with a 1-min resolution. Both ASOS and sea fog stations provide visibility measurements, which were jointly used as reference information for sea fog labeling. Since such features of sea fog were difficult to extract automatically [18], annotation tools were used to perform precise manual annotation (Table 2).
Overall, 86 annotated scenes were selected from confirmed sea-fog days, each corresponding to a daytime case in which sea fog was clearly present in the FCC imagery. Across all annotated pixels in these 86 fog-containing scenes, sea fog and non-sea-fog pixels account for 4% and 96%, respectively; this class imbalance may cause minority class bias during training, necessitating imbalance correction. Figure 4 summarizes the monthly sampling distribution of the labeled dataset for transparency. Long-term observational studies report that sea fog around the Korean Peninsula exhibits strong seasonality, with higher occurrence typically during the warm season from late spring to summer and higher frequencies over the Yellow/West Sea than over the East Sea [39,40]. Accordingly, our labeled dataset was constructed by selecting confirmed fog days rather than by uniform temporal sampling. Almost 80% of our data is from May and June, while early spring (March and April) has almost 20%, and late winter (February) has less than 5%.

2.3. Co-Registration for AMI and GOCI-II Satellite Images

The GK2A and GK2B satellites acquire imagery using different sensor characteristics and observation methods. Therefore, ensuring spatial alignment is essential for simultaneously utilizing both images. We performed a co-registration procedure to achieve precise fusion of AMI and GOCI-II imagery. Image co-registration is the process of overlaying two or more images of the same scene captured at different times, from different perspectives, or by different sensors. First, unique control points, such as closed boundaries, edges, contours, and line intersections, are automatically or manually detected in both images. Then, corresponding control point pairs are identified using various feature descriptors, similarity measures, and spatial relationships. Subsequently, based on these correspondences, a mapping function such as a homography is estimated. Finally, the estimated mapping function is applied to align the coordinate system of the sensed image with that of the reference image. Non-integer pixel coordinates are interpolated to obtain the final matched image [41].
We employed RoMa, a deep learning-based feature detection and matching model (Figure 5). RoMa provides powerful capabilities for estimating pixel-level dense warps and their uncertainties, demonstrating high robustness by leveraging the pre-trained Distillation of Knowledge with No Labels Version 2 (DINOv2). DINOv2 is a Vision Transformer trained with self-supervised learning; RoMa builds on its coarse features with a match decoder that predicts anchor probabilities and processes multimodal representations, enabling more sophisticated matching than traditional local features. Furthermore, RoMa introduces a regression-by-classification loss function to enhance matching performance, achieving results that are much better than those of existing methods [42]. Traditional Scale-Invariant Feature Transform (SIFT) and Oriented FAST and Rotated BRIEF (ORB) algorithms are manually designed feature descriptors. While SIFT offers scale and rotation invariance and ORB provides fast computation and efficiency, their performance is limited under conditions of lighting changes, distortions, or a lack of texture. RoMa overcomes these limitations by combining deep features from DINOv2 with details extracted from a ConvNet to achieve more reliable matching.
The RoMa model can automatically detect and match features in AMI and GOCI-II images. Based on the matched features, we computed the homography matrix between the GOCI-II and AMI images. The computed homography matrix was applied to the GOCI-II image to register the two images. A comparison before and after registration revealed a significant spatial mismatch between the AMI and GOCI-II images, attributed to coordinate differences before co-registration. However, after co-registration, the two images were found to be noticeably aligned. We used the original RoMa implementation with publicly released pre-trained weights and did not perform additional fine-tuning on our dataset, because RoMa is designed as a general-purpose dense matcher and our primary objective was to generate geometrically consistent fusion inputs. The registration performance was evaluated qualitatively by visually inspecting the alignment of coastlines and major geographical features with the KHOA shoreline data, and we confirmed that the residual misalignment was negligible at the native resolution for all selected scenes. Figure 6 shows the comparisons of the reference AMI FCC image and the GOCI-II FCC image before and after co-registration, providing a visually consistent basis for multi-sensor comparison. The AMI FCC image represents land surfaces in dark green and sea areas in dark blue, while the GOCI-II FCC image depicts land surfaces in dark red and sea areas in dark cyan. Additionally, the white solid line represents the coastline data provided by KHOA. Before the registration, the shoreline in the GOCI-II FCC image was slightly misaligned with the KHOA shoreline. After co-registration, both images achieved precise spatial alignment, accurately matching the shorelines.
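As a rough sketch of this homography-based registration step, the following code estimates a homography from corresponding point pairs and resamples the GOCI-II image onto the AMI grid using OpenCV. The dense-matching stage itself is abstracted away: the matched pixel coordinates are assumed to be already available (for example, sampled from the warp predicted by RoMa), and the function and variable names are illustrative rather than the exact implementation.

```python
import cv2
import numpy as np

def register_goci_to_ami(goci_img, pts_goci, pts_ami, ami_shape):
    """Warp a GOCI-II image onto the AMI pixel grid from matched point pairs.

    pts_goci, pts_ami : (N, 2) arrays of corresponding pixel coordinates,
    e.g., sampled from the dense warp and certainty predicted by RoMa.
    """
    # Robust homography estimation with RANSAC rejects remaining outlier matches.
    H, inlier_mask = cv2.findHomography(pts_goci.astype(np.float32),
                                        pts_ami.astype(np.float32),
                                        cv2.RANSAC, 3.0)
    # Resample the GOCI-II image onto the AMI grid; bilinear interpolation
    # handles the non-integer pixel coordinates mentioned above.
    h, w = ami_shape[:2]
    registered = cv2.warpPerspective(goci_img, H, (w, h), flags=cv2.INTER_LINEAR)
    return registered, H
```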

2.4. Training Deep Learning Segmentation Models

We adopted recent state-of-the-art (SOTA) semantic segmentation models based on CNNs (OCRNet, ConvNeXt-L, and SegNeXt) and Transformers (SegFormer, Swin Transformer, and Mask2Former) [43,44,45,46,47,48], which demonstrate strong performance in satellite remote sensing and can effectively process the complex image features required for sea fog monitoring. While CNN-based models leverage the characteristics of convolutional neural networks to integrate local features and multi-resolution information effectively, Transformer-based models excel at capturing global as well as local contextual information, enabling a more precise understanding of complex image patterns. By training both types of models and comparing their results, we aim to identify the most suitable modeling approach for sea fog detection.
Spatially aligned AMI and GOCI-II images were cropped into identical regions and resampled with dimensions of 1024 × 1024 pixels, resulting in the six-channel fusion input composed of the three AMI and three GOCI-II bands described in Section 2.2. The entire training dataset consists of 1002 patches for input and label data, obtained from 86 scenes captured on 24 foggy days during seven months in 2023 and 2024. To facilitate evaluation, 10 scenes (120 patches) from two fog days were assigned to the validation set, and another 10 scenes (120 patches) from two different fog days were allocated to the test set. The remaining 66 scenes (762 patches) from 20 foggy days were used as the training set. These splits were performed on a scene-by-scene basis, meaning that all patches from a given scene were assigned to the same subset rather than being randomly mixed at the patch level. This ensures separation between the train, validation, and test sets, which is necessary to avoid overfitting of the deep learning model. In addition, the fog/non-fog pixel ratios in the training, validation, and test sets were kept close to the overall 4% vs. 96% distribution shown in Table 2. Although the number of training scenes is limited, they cover multiple sea fog events over different days and regions around the Korean Peninsula, providing a diverse set of fog patterns for model learning. Transfer learning was conducted for efficient learning and rapid optimization. The weights from the first input layer of a 3-channel model pre-trained on the existing ADE20K dataset were imported and scaled to fit the 6-channel input. Input images underwent normalization using channel-wise mean and standard deviation before being fed to the model during training. Subsequently, data augmentation was applied using geometric transformations such as random resizing, cropping, and flipping. This procedure exposed the model to diverse data variations, enhancing its generalization performance. In combination with transfer learning, these augmentation strategies were adopted to effectively expand the variability of the training data and compensate for the limited number of original scenes.
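Returning to the weight-transfer step described above, one common way to realize it is sketched below, assuming a PyTorch backbone whose stem is a standard Conv2d. The pretrained 3-channel kernel is tiled to 6 channels and rescaled so that the expected activation magnitude is preserved; the function name and the exact location of the stem layer are illustrative and depend on the specific model.

```python
import torch
import torch.nn as nn

def expand_stem_to_6ch(conv3: nn.Conv2d) -> nn.Conv2d:
    """Create a 6-channel stem conv initialized from 3-channel pretrained weights.

    The pretrained kernel (out_ch, 3, k, k) is tiled to (out_ch, 6, k, k) and
    scaled by 3/6 so the expected activation magnitude stays comparable.
    """
    conv6 = nn.Conv2d(6, conv3.out_channels,
                      kernel_size=conv3.kernel_size,
                      stride=conv3.stride,
                      padding=conv3.padding,
                      bias=conv3.bias is not None)
    with torch.no_grad():
        w = conv3.weight                                   # (out_ch, 3, k, k)
        conv6.weight.copy_(torch.cat([w, w], dim=1) * 0.5)  # tile and rescale
        if conv3.bias is not None:
            conv6.bias.copy_(conv3.bias)
    return conv6
```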
The hyperparameters used during model training were autotuned by Optuna to optimize training efficiency and performance. Optuna is a Bayesian optimization framework that samples candidate hyperparameter combinations within the search space per trial, repeatedly training and evaluating the model for each combination to derive optimal hyperparameter values. Unlike grid search or random search, it offers the advantage of probabilistic exploration of optimal combinations in the N-dimensional hyperparameter space, even when new datasets are added or experimental conditions change. We set the ranges of learning rate, batch size, dropout ratio, weight decay, and background class weight for the hyperparameter space (Table 3). Among the five trials, the combination that ultimately achieved the highest performance was adopted as the optimal hyperparameters. Moreover, background class weighting was applied to suppress background overconfidence and mitigate undetected fog.
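A condensed sketch of this autotuning loop is shown below. The search ranges are illustrative placeholders for those listed in Table 3, and train_and_validate is a hypothetical helper standing in for training a segmentation model with the sampled configuration and returning its validation IoU.

```python
import optuna

def objective(trial):
    # Candidate hyperparameters drawn from illustrative ranges
    # (the actual search space is listed in Table 3).
    cfg = {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True),
        "batch_size": trial.suggest_categorical("batch_size", [2, 4, 8]),
        "dropout": trial.suggest_float("dropout", 0.0, 0.5),
        "weight_decay": trial.suggest_float("weight_decay", 1e-6, 1e-2, log=True),
        "background_class_weight": trial.suggest_float("background_class_weight", 0.1, 1.0),
    }
    # Hypothetical helper: trains the segmentation model with cfg and
    # returns the validation IoU of the sea fog class.
    return train_and_validate(cfg)

study = optuna.create_study(direction="maximize")  # maximize validation IoU
study.optimize(objective, n_trials=5)              # five trials, as in this study
best_hyperparameters = study.best_params
```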
All models were trained on an NVIDIA GeForce RTX 3090 Ti (24 GB) with an Intel(R) Core(TM) i9-12900K and 64 GB of RAM. The average training time was approximately 15 h per model until convergence, including five Optuna trials, using the configuration described above. For inference, we report the runtime using a representative Transformer-based segmentation model, which required approximately 0.2 s per 1024 × 1024 patch and approximately 2.4 s per full scene when processing the scene as multiple 1024 × 1024 patches (on average ~12 patches per scene in our dataset). These runtimes indicate that near-real-time daytime sea fog monitoring is feasible when considering the operational update cycles of GK2A AMI (10 min) and GK2B GOCI-II.
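As a minimal illustration of this patch-wise processing, the snippet below splits a scene into non-overlapping 1024 × 1024 tiles with zero padding at the edges; the actual operational tiling scheme (overlap, padding strategy) may differ, so this is a sketch under those assumptions.

```python
import numpy as np

def tile_scene(scene, patch=1024):
    """Split an H x W x C scene into non-overlapping patch x patch tiles.

    Edges are zero-padded so the scene divides evenly; the returned coordinates
    allow per-patch predictions to be stitched back into a full-scene fog mask.
    """
    h, w, c = scene.shape
    ph = int(np.ceil(h / patch)) * patch
    pw = int(np.ceil(w / patch)) * patch
    padded = np.zeros((ph, pw, c), dtype=scene.dtype)
    padded[:h, :w] = scene
    tiles, coords = [], []
    for y in range(0, ph, patch):
        for x in range(0, pw, patch):
            tiles.append(padded[y:y + patch, x:x + patch])
            coords.append((y, x))
    return tiles, coords, (h, w)
```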
To quantitatively evaluate the fog detection model, several commonly used pixel-level classification metrics were employed. By comparing the model’s inference results with the labeled image pixel-by-pixel, true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN) were calculated for the segmented images. Based on this, Intersection over Union (IoU), Accuracy, Precision, Recall, and F1-score were calculated to evaluate the model’s segmentation performance.
Intersection over Union (IoU) is defined as the ratio of the intersected area between the predicted and actual fog area to the union area of both. A high IoU value indicates that the model closely matches the actual sea fog area.
  • IoU = TP/(TP + FP + FN)
Accuracy is the ratio of correctly classified pixels to the total number of pixels. While Accuracy provides an intuitive overview of overall performance, it serves only as a supplementary metric in scenarios with severe class imbalance, such as sea fog detection.
  • Accuracy = (TP + TN)/(TP + FP + FN + TN)
Precision is the ratio of correctly predicted fog pixels to all pixels predicted as fog by the model. High precision indicates that the model does not tend to overestimate.
  • Precision = TP/(TP + FP)
False Alarm Ratio (FAR) is defined as the proportion of false positives among all predicted fog pixels.
  • FAR = FP/(TP + FP) = 1 − Precision
Recall is the ratio of correctly detected fog pixels to the actual fog pixels. This value indicates the model’s sensitivity; a higher value indicates a lower tendency to underestimate.
  • Recall = TP/(TP + FN)
The F1-score is the harmonic mean of precision and recall, balancing the two metrics. It reflects overall performance degradation when one metric is extremely low.
  • F1 score = 2 × (Precision × Recall)/(Precision + Recall)
Thus, each metric evaluates the model’s fog detection performance and prediction accuracy from a different perspective, and together they were used to analyze the overall segmentation performance quantitatively.
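For reference, a small helper that computes all of the above metrics from a pair of binary masks might look as follows; this is a minimal sketch in which per-scene aggregation and zero-division handling are omitted.

```python
import numpy as np

def fog_metrics(pred, label):
    """Pixel-level metrics for a binary fog mask (1 = sea fog, 0 = background)."""
    pred = pred.astype(bool)
    label = label.astype(bool)
    tp = np.sum(pred & label)
    fp = np.sum(pred & ~label)
    fn = np.sum(~pred & label)
    tn = np.sum(~pred & ~label)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "IoU": tp / (tp + fp + fn),
        "Accuracy": (tp + tn) / (tp + fp + fn + tn),
        "Precision": precision,
        "FAR": fp / (tp + fp),  # equals 1 - Precision
        "Recall": recall,
        "F1": 2 * precision * recall / (precision + recall),
    }
```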

3. Results

The hyperparameters optimized by Optuna (Table 4) were selected as the final configuration for each model. The test dataset was evaluated with the optimized models, and the evaluation metrics for sea fog detection are summarized in Table 5. Although all models achieved over 98% accuracy, this is largely because background pixels overwhelmingly dominate the images; metrics such as IoU, Precision, and Recall are therefore better suited for comparing model performance. Swin Transformer achieved the highest performance with an IoU of 77.242, followed closely by Mask2Former (76.118) and SegNeXt (76.110). In contrast, SegFormer (72.049) and OCRNet (71.194) had relatively low IoU. In terms of Precision, Swin Transformer (82.796) achieved the highest score, indicating high confidence in the predicted fog areas. For Recall, SegNeXt (93.616) was the highest, demonstrating strength in not missing actual fog pixels. The F1-score, which balances the two metrics, was highest for Swin Transformer (87.160), with Mask2Former (86.440) and SegNeXt (86.435) showing very close performance, confirming them as particularly effective models for sea fog detection.
Figure 7 provides an overview of the FCC and labeled images for the test dataset, along with the inference results from each detection model, thereby complementing the quantitative comparison in Table 5 with qualitative FP/FN patterns across architectures. The segmentation maps for the model results are black for TN, white for TP, orange for FP, and red for FN. Overall, all models showed high rates of TP and TN. Notably, the Transformer-based Swin Transformer and Mask2Former, along with the CNN-based SegNeXt, recorded the highest TP and TN rates and the lowest FP and FN rates, suggesting excellent fog detection performance. In contrast, OCRNet exhibited lower FP but somewhat higher FN, while SegFormer showed lower FN but a slightly higher FP. Overall, Transformer-based models, particularly Swin Transformer, demonstrated superior performance compared to CNN-based models in precisely detecting complex sea fog boundaries, achieving high IoU and F1-scores.
Table 6 shows the evaluation metrics for the Swin Transformer model across the test dataset slots. Slots containing no fog pixels or extremely rare fog pixels were excluded from evaluation. S007 (Korean Peninsula) and S010 (Southern Bohai Sea and Eastern China) showed the highest performance, with IoUs of 92.114 and 89.658 and F1-scores of 95.895 and 94.547, respectively. Precisions of 94.612 and 92.970 and recalls of 97.214 and 96.179 also indicate balanced performance. S004 (Central-Southern East Sea and East Coast) also showed excellent results, with an IoU of 81.899 and an F1-score of 90.049, while S005 (Northeast East Sea and Northwest Japan) demonstrated strength in minimizing fog omission, with a recall of 95.681. Conversely, S001 (Southern Japanese Waters) and S002 (Eastern Japanese Waters) had low precisions of 49.439 and 58.331, resulting in IoUs of 49.050 and 51.746. Although recall was high at 98.423 and 82.090 for these two regions, false detections increased due to the combination of complex backgrounds and low-contrast features such as clouds, sea surface reflections, and haze. In addition, these slots contained relatively fewer fog cases in the training data, which may have limited the model’s ability to learn region-specific fog–background characteristics and contributed to the reduced precision. S008 (Northern East Sea and Southern Primorsky Krai) showed a high precision of 94.658 but a low recall of 68.812, indicating a tendency toward increased omissions. In summary, the model operated stably in areas with frequent and distinct fog patterns, such as S007, S010, and S004. Conversely, in areas with complex backgrounds and relatively fewer fog cases, such as S001 (Southern Japanese Waters) and S002 (Eastern Japanese Waters), an increase in false positives or false negatives was observed. These waters are strongly influenced by warm currents, such as the Kuroshio Current, resulting in relatively high sea surface temperatures. Consequently, the fog here may consist of a mixture of radiation fog or steam fog, which arises from the condensation of warm, moist air itself, rather than the typical advection fog formed by warm air cooling over a cold sea surface. Radiation and steam fogs often have a thin optical thickness, making them difficult for satellite sensors to capture. Furthermore, the high frequency of mid- and high-level clouds in these areas, due to the passage of low-pressure systems or the influence of seasonal rain fronts, makes it challenging to distinguish fog from clouds.

4. Discussion

4.1. Comparisons with Currently Operational Products

Figure 8 compares the officially operational AMI Fog and GOCI-II Marine Fog (MF) products with the results predicted by the Swin Transformer model. Overall, the AMI Fog product showed a tendency toward underestimation, particularly a significant tendency to exclude sea fog in areas associated with shallow or optically thin cloud. In contrast, our deep learning model reduced omission errors by detecting sea fog signals in such conditions when fog features remained distinguishable in the FCC imagery and low-level CTH, leveraging the spatial–spectral context of surrounding pixels. It should be noted that fog fully obscured by optically thick cloud layers remains undetectable using optical imagery alone. Meanwhile, the GOCI-II MF product exhibited a strong tendency toward overestimation, with instances observed in which it misidentified areas of high-level clouds (shown in pink on the AMI FCC) as sea fog. In contrast, our model demonstrated relatively stable and balanced detection performance compared to the existing operational outputs. This suggests the model effectively mitigated both under- and overestimation issues by comprehensively learning diverse spectral and spatial patterns.

4.2. Advantages from Multi-Sensor Image Fusion

Table 7 and Figure 9 show the results comparing the single-satellite and multi-satellite models. The multi-satellite model using AMI and GOCI-II showed slightly improved performance compared to the single-satellite model, with an IoU of 77.242 and an F1-score of 87.160. However, while the multi-satellite model had a somewhat lower Precision of 82.796 compared to the standalone AMI model’s 84.130, Recall significantly improved from 88.782 to 92.009. This indicates that the multi-satellite model experienced an increase in false positives but a reduction in missed fog pixels. In the fog detection problem, fog pixels are not only a minority class but also areas that pose potential safety risks across various fields, such as transportation and navigation. Therefore, the importance of Recall, which ensures the detection of all actual fog areas without omission, is particularly significant. Securing high Recall over Accuracy or Precision is crucial for reliably detecting sea fog from the perspective of preventing risks before they occur. These results demonstrate that complementary information from different sensors, provided by AMI and GOCI-II, effectively captures the overall pattern, improving not only IoU and F1-score but also Recall, a core metric for fog detection. Future research should refine the model by enhancing multi-sensor fusion strategies and expanding the dataset to minimize false detections while reducing fog detection omissions.

4.3. An In-Depth Case Study

To validate the actual detection performance, two distinct fog-inflow cases were examined in depth using the Swin Transformer model’s inference results (Figure 10). The first was a widespread advection fog case observed on 5 July 2024 along the West Coast, Jeju Island, and Busan. The second was a localized fog case declared near Daesan Port on 9 August.
In the 5 July case, the Swin Transformer model successfully detected widespread fog over the West Coast and waters off Jeju Island. Marine weather observation buoy data from multiple points showed air temperatures higher than sea surface temperatures and humidity exceeding 90%, precisely matching the typical formation conditions for advection fog: warm, moist air cooling as it passes over the cold sea surface. This pattern of dense, uniformly developed advection fog across such a broad area was also clearly visible in AMI and GOCI-II FCC imagery, as indicated by the location labels in Figure 10. This uniformity was a key factor enabling the model to extract and detect its features reliably. However, sea fog near Busan on the same day was not detected. Analysis of satellite imagery revealed that the sea fog around Busan was very sparse and exhibited a complex cloud structure, with low-level and high-level clouds mixed. This low contrast and complex background are analyzed as having made it difficult for the model to clearly distinguish the unique signal of sea fog. This case demonstrates that the distributions of sea fog and the interactions with surrounding clouds can be critical variables that affect the detection performance of deep learning models.
For the 9 August Daesan Port case, our Swin Transformer model showed highly impressive performance. Although the sea fog in that area was too faint to identify in a true-color image, the model successfully detected it. This demonstrates the model’s high sensitivity, capable of detecting subtle spectral changes beyond human visual perception, because we used six channels, including visible, shortwave infrared, and thermal infrared bands from AMI and GOCI-II. Buoy data at the time suggested potential mixing of air masses with locally distinct characteristics, which could be associated with conditions for fog or light vapor fog that can occur in bay terrain. The model is judged to have effectively detected this sea fog with weak features by utilizing specific spectral band information.
Overall, the Swin Transformer model demonstrated strong reliability in detecting widespread, dense advection fog patterns. Conversely, it was confirmed that performance could be limited when thin advection fog is mixed with complex cloud structures, as seen in the Busan case. Also, the Daesan Port case demonstrated high sensitivity, exceeding human observational capabilities, for faint, localized sea fog that is difficult to identify visually. These results suggest that deep learning-based sea fog detection models may exhibit varying performance depending on the physical causes of fog formation and its visual characteristics (concentration, morphology, and surrounding environment). Specifically, it was confirmed that the key factor determining performance is the model’s ability to learn the unique spectral and morphological features of each fog type. Therefore, future research should focus on introducing data augmentation strategies tailored to the characteristics of each fog type and on constructing detailed datasets that cover diverse cases. This will be crucial for improving generalization, enabling the model to ensure robust performance even in light fog or complex meteorological conditions.

4.4. Limitations and Future Directions

The results according to slots showed that not only model architecture but also regional and meteorological factors significantly influence detection performance. Slots S007 (Korean Peninsula) and S010 (Southern Bohai Sea/Eastern China) exhibited frequent fog occurrences with distinct patterns, resulting in a high IoU and F1-score exceeding 90. Conversely, slots S001, S002, and S008 showed significantly degraded detection performance due to complex background factors like clouds, low-level clouds, and sea surface reflections. These results suggest that environmental conditions in the observation areas directly influence detection performance differences, extending beyond mere limitations of the model architecture. Particularly in some slots, the insufficient number of sea fog cases may have prevented regional characteristics from being adequately reflected during training and validation. Therefore, future research should construct datasets with sufficient sea fog cases per slot and establish environments that enable balanced learning and evaluation of regional meteorological and oceanic characteristics. In particular, sampling strategies informed by climatology or fog distribution should be explored to reduce regional sampling bias, including stratified sampling by slot and targeted data collection for underrepresented regions such as S001 and S002. In addition, an independent labeling review by human experts, such as operational forecasters or ocean and meteorology specialists, and a quantitative inter-annotator agreement analysis should be conducted to further validate the reliability of the reference labels.
Several directions can be proposed to address these limitations in future research. First, dataset augmentation and expansion are needed to mitigate the spatiotemporal bias and imbalance in fog data. Acquiring additional fog cases across various seasons and regions, combined with the application of image augmentation techniques, could enhance the model’s generalization performance. Additionally, using generative models like Generative Adversarial Network (GAN) or diffusion models to artificially generate fog patterns that are difficult to obtain from actual observations could be considered to ensure diversity in the training data. Furthermore, designing a hybrid network architecture that combines CNNs’ local feature extraction with Transformers’ global representation learning is expected to enable more sophisticated detection of complex, ambiguously bounded fog areas. Finally, adding a post-processing module that uses auxiliary meteorological variables, such as Cloud Top Height (CTH) and relative humidity, could account for physical constraints, reduce false detections, and enhance detection reliability.
Figure 11 shows the input image, label image, model prediction result, and AMI CTH image for the case where the Swin Transformer recorded the lowest IoU across the entire test set. The AMI CTH image visualizes cloud top height, where pixels at or above 3 km altitude or without clouds are rendered completely black, while values between 0 and 3 km are gradient-shaded from white to black. In this case, FPs were the predominant error type: areas predicted by the model as sea fog were identified as non-sea fog in the label image. Examining the AMI CTH for this region reveals a cloud top height below 3 km, with a mixture of dark and light gray tones. Furthermore, the AMI FCC imagery also lacks a homogeneous texture, making it difficult to attribute this to annotation errors and suggesting a genuine mixture of low-level clouds and sea fog. Therefore, in such complex areas, the model is interpreted as making a slight misclassification. Furthermore, the GOCI-II FCC imagery appears to have produced numerous no-data regions during registration, resulting in data discontinuities. This data incompleteness is also considered to have contributed to the model’s misclassification. To address these issues, appropriate preprocessing and correction techniques for no-data regions must be introduced. Furthermore, model improvements that leverage additional feature information, such as cloud height and texture, are required to better distinguish subtle differences between low-level clouds and sea fog. Future efforts should include re-examining annotations and conducting additional case studies to develop strategies for more precisely enhancing model performance in complex areas.

5. Conclusions

This study combined complementary spectral information from the geostationary satellite sensors GK2A AMI and GK2B GOCI-II for daytime sea fog detection. It precisely corrected the spatial misalignment of the imagery through deep learning-based co-registration (RoMa), then trained and evaluated the latest semantic segmentation models using a 6-channel fusion input. Swin Transformer, Mask2Former, and SegNeXt demonstrated balanced, excellent performance across overall metrics such as IoU and F1-score. Notably, multi-satellite fusion significantly improved Recall compared to the single AMI input, mitigating the omission of disaster information. These results indicate that combining sensors with differing spectral and spatiotemporal characteristics contributes to enhanced sensitivity in sea fog segmentation.
Additional analysis revealed performance variations across regions and time slots. Areas adjacent to the Korean Peninsula and the East Sea supported the model’s effectiveness, with high IoU and F1-scores, while performance in some regions was limited due to insufficient data and complex backgrounds. Furthermore, factors such as low-level cloud overlap, low contrast, and no-data regions introduced during registration could cause false positives and false negatives. At the same time, class imbalance also limited the interpretability of the Accuracy metric. This indicates that environmental context and data distribution directly impact model performance, and that the quality of integration and preprocessing determines actual operational capability.
Future work should focus on hybrid designs combining the Transformer’s global context learning with the CNN’s local detail extraction and on constructing datasets with expanded sea fog cases, correcting integration errors and no-data, integrating auxiliary features like CTH and texture, and incorporating GAN-based augmentation for fog types (e.g., advection and steam). The methodology presented in this study, encompassing the entire process of multi-satellite fusion, alignment, and segmentation, is expected to substantially enhance fog detection systems and improve maritime and aviation operational safety.

Author Contributions

Conceptualization, J.K. (Jonggu Kang) and Y.L.; methodology, J.K. (Jonggu Kang) and Y.L.; formal analysis, J.K. (Jonggu Kang); data curation, J.K. (Jonggu Kang); writing—original draft preparation, J.K. (Jonggu Kang); writing—review and editing, H.M., S.H.K., M.K., D.K., J.K. (Jinsoo Kim) and Y.L.; supervision, Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by a grant (2021-MOIS37-002) from the Intelligent Technology Development Program on Disaster Response and Emergency Management funded by the Ministry of Interior and Safety (MOIS, Republic of Korea).

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

Author Hiroyuki Miyazaki was employed by the company Global Data Lancers (GLODAL) Incorporation. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

AHI: Advanced Himawari Imager
AMI: Advanced Meteorological Imager
ASOS: Automated Surface Observing System
CNN: Convolutional Neural Network
CTH: Cloud Top Height
DINOv2: Distillation of Knowledge with No Labels v2
EUMETSAT: European Organisation for the Exploitation of Meteorological Satellites
F1-score: F1-score
FAR: False Alarm Ratio
FCC: False Color Composite
FN: False Negative
FP: False Positive
GAN: Generative Adversarial Network
GK2A: GEO-KOMPSAT-2A (Geostationary Korea Multi-Purpose Satellite 2A)
GK2B: GEO-KOMPSAT-2B (Geostationary Korea Multi-Purpose Satellite 2B)
GOCI-II: Geostationary Ocean Color Imager II
IoU: Intersection over Union
JMA: Japan Meteorological Agency
KARI: Korea Aerospace Research Institute
KHOA: Korea Hydrographic and Oceanographic Agency
MODIS: Moderate Resolution Imaging Spectroradiometer
MSG: Meteosat Second Generation
NIR: Near-Infrared
NOAA: National Oceanic and Atmospheric Administration
OCRNet: Object-Contextual Representations Network
ORB: Oriented FAST and Rotated BRIEF
RGB: Red–Green–Blue
RoMa: Robust Dense Feature Matching
SIFT: Scale-Invariant Feature Transform
SOTA: State-of-the-Art
SWIR: Shortwave Infrared
TIR: Thermal Infrared
TN: True Negative
TP: True Positive
U-Net: U-shaped Network
VIS: Visible

References

  1. Lee, J.Y.; Kim, K.J.; Son, Y.T. Operation Measures of Sea Fog Observation Network for Inshore Route Marine Traffic Safety. J. Korean Soc. Mar. Environ. Saf. 2023, 29, 188–196. [Google Scholar] [CrossRef]
  2. Korea Institute of Ocean Science and Technology. Development of AI-Based Coastal Disaster Modelling Platform and Sea-Fog Prediction System; Ministry of Oceans and Fisheries: Busan, Republic of Korea, 2021. [Google Scholar]
  3. Gultepe, I.; Tardif, R.; Michaelides, S.C.; Nagai, T.; Bott, A.; Bendix, J.; Mueller, M.D.; Kim, D.Y.; Pagowski, M.; Hansen, B.; et al. Fog research: A review of past achievements and future perspectives. Pure Appl. Geophys. 2007, 164, 1121–1159. [Google Scholar] [CrossRef]
  4. Korea Institute of Ocean Science and Technology. Planning Research for Development of Marine Security, Disaster and Space Management Technology; Ministry of Oceans and Fisheries: Busan, Republic of Korea, 2018. [Google Scholar]
  5. Cermak, J.; Bendix, J. A novel approach to fog/low stratus detection using Meteosat 8 data. Atmos. Res. 2008, 87, 279–292. [Google Scholar] [CrossRef]
  6. Wu, X.; Li, S. Automatic sea fog detection over Chinese adjacent oceans using Terra/MODIS data. Int. J. Remote Sens. 2014, 35, 7430–7457. [Google Scholar] [CrossRef]
  7. Amani, M.; Mahdavi, S.; Bullock, T.; Beale, S. Automatic nighttime sea fog detection using GOES-16 imagery. Atmos. Res. 2020, 238, 104712. [Google Scholar] [CrossRef]
  8. EUMETSAT Fog/Low Clouds RGB-MSG-0 Degree. Available online: https://navigator.eumetsat.int/product/EO:EUM:DAT:MSG:FOG (accessed on 1 November 2025).
  9. NOAA Low Cloud and Fog. Available online: https://www.goes-r.gov/products/opt2-low-cloud-fog.html (accessed on 1 November 2025).
  10. NOAA Low-Level Cloud and Fog Product. Available online: https://rammb.cira.colostate.edu/research/goes-r/proving_ground/cira_product_list/synthetic_nssl_wrf-arw_imagery_1035m39_microns.asp (accessed on 1 November 2025).
  11. JMA—Fog Monitoring. Available online: https://www.data.jma.go.jp/mscweb/en/product/monitor_fog.html (accessed on 1 November 2025).
  12. Wu, D.; Lu, B.; Zhang, T.; Yan, F. A method of detecting sea fogs using CALIOP data and its application to improve MODIS-based sea fog detection. J. Quant. Spectrosc. Radiat. Transf. 2015, 153, 88–94. [Google Scholar] [CrossRef]
  13. He, L.; Gao, H.; Xu, J.; Huang, G. Application of Himawari-8 Satellite Data in Daytime Sea Fog Monitoring in South China Coast. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2019; Volume 237, p. 022007. [Google Scholar] [CrossRef]
  14. Ryu, H.-S.; Hong, S. Sea Fog Detection Based on Normalized Difference Snow Index Using Advanced Himawari Imager Observations. Remote Sens. 2020, 12, 1521. [Google Scholar] [CrossRef]
  15. Yuan, Y.; Qiu, Z.; Sun, D.; Wang, S.; Yue, X. Daytime Sea Fog Retrieval Based on GOCI Data: A Case Study over the Yellow Sea. Opt. Express 2016, 24, 787–801. [Google Scholar] [CrossRef]
  16. Lee, K.; Yoon, H.-J.; Kwon, B.-H. Sea Fog Detection Algorithm Using Visible and Near Infrared Bands. J. Korea Inst. Electron. Commun. Sci. 2018, 13, 669–676. [Google Scholar] [CrossRef]
  17. Heo, K.-Y.; Min, S.-Y.; Ha, K.-J.; Kim, J.H. Discrimination between Sea Fog and Low Stratus Using Texture Structure of MODIS Satellite Images. Korean J. Remote Sens. 2008, 24, 571–581. [Google Scholar]
  18. Kim, D.; Park, M.-S.; Park, Y.-J.; Kim, W. Geostationary Ocean Color Imager (GOCI) Marine Fog Detection in Combination with Himawari-8 Based on the Decision Tree. Remote Sens. 2020, 12, 149. [Google Scholar] [CrossRef]
  19. Wang, Y.; Hu, C.; Qiu, Z.; Zhao, D.; Wu, D.; Liao, K. Research on the Multi-Source Satellite Daytime Sea Fog Detection Technology Based on Cloud Characteristics. J. Trop. Oceanogr. 2023, 42, 15–28. [Google Scholar] [CrossRef]
  20. Wang, Y.; Qiu, Z.; Zhao, D.; Ali, M.A.; Hu, C.; Zhang, Y.; Liao, K. Automatic Detection of Daytime Sea Fog Based on Supervised Classification Techniques for FY-3D Satellite. Remote Sens. 2023, 15, 2283. [Google Scholar] [CrossRef]
  21. Liu, D.; Ke, L.; Zeng, Z.; Zhang, S.; Liu, S. Machine Learning-Based Analysis of Sea Fog’s Spatial and Temporal Impact on Near-Miss Ship Collisions Using Remote Sensing and AIS Data. Front. Mar. Sci. 2025, 11, 1536363. [Google Scholar] [CrossRef]
  22. Han, J.H.; Kim, K.J.; Joo, H.S.; Han, Y.H.; Kim, Y.T.; Kwon, S.J. Sea Fog Dissipation Prediction in Incheon Port and Haeundae Beach Using Machine Learning and Deep Learning. Sensors 2021, 21, 5232. [Google Scholar] [CrossRef]
  23. Xu, M.; Wu, M.; Guo, J.; Zhang, C.; Wang, Y.; Ma, Z. Sea fog detection based on unsupervised domain adaptation. Chin. J. Aeronaut. 2022, 35, 415–425. [Google Scholar] [CrossRef]
  24. Bari, D.; Lasri, N.; Souri, R.; Lguensat, R. Machine Learning for Fog-and-Low-Stratus Nowcasting from Meteosat SEVIRI Satellite Images. Atmosphere 2023, 14, 953. [Google Scholar] [CrossRef]
  25. Zhu, C.; Wang, J.; Liu, S.; Hui, S. Sea Fog Detection Using U-Net Deep Learning Model Based on MODIS Data. In Proceedings of the 2019 10th Workshop on Hyperspectral Imaging and Signal Processing: Evolution in Remote Sensing (WHISPERS), Amsterdam, The Netherlands, 24–26 September 2019; pp. 1–5. [Google Scholar]
  26. Jeon, H.-K.; Kim, S.; Edwin, J.; Yang, C.-S. Sea Fog Identification from GOCI Images Using CNN Transfer Learning Models. Electronics 2020, 9, 311. [Google Scholar] [CrossRef]
  27. Guo, X.; Wan, J.; Liu, S.; Xu, M.; Sheng, H.; Yasir, M. A scSE-LinkNet Deep Learning Model for Daytime Sea Fog Detection. Remote Sens. 2021, 13, 5163. [Google Scholar] [CrossRef]
  28. Lu, H.; Ma, Y.; Zhang, S.; Yu, X.; Zhang, J. Daytime Sea Fog Identification Based on Multi-Satellite Information and the ECA-TransUnet Model. Remote Sens. 2023, 15, 3949. [Google Scholar] [CrossRef]
  29. Zhou, Y.; Chen, K.; Li, X. Dual-Branch Neural Network for Sea Fog Detection in Geostationary Ocean Color Imager. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4208617. [Google Scholar] [CrossRef]
  30. Tang, Y.; Yang, P.; Zhou, Z.; Zhao, X. Daytime Sea Fog Detection Based on a Two-Stage Neural Network. Remote Sens. 2022, 14, 5570. [Google Scholar] [CrossRef]
  31. Lateef, F.; Ruichek, Y. Survey on Semantic Segmentation Using Deep Learning Techniques. Neurocomputing 2019, 338, 321–348. [Google Scholar] [CrossRef]
  32. Huang, B.; Gao, S.; Yu, R.; Zhao, W.; Zhou, G. Monitoring Sea Fog over the Yellow Sea and Bohai Bay Based on Deep Convolutional Neural Network. J. Trop. Meteor. 2024, 30, 223–229. [Google Scholar] [CrossRef]
  33. GEO-KOMPSAT-2A User Readiness Planning. Available online: https://nmsc.kma.go.kr/enhome/html/base/cmm/selectPage.do?page=satellite.gk2a.userReadinessInformation (accessed on 1 November 2025).
  34. GEO-KOMPSAT-2B. Available online: https://nosc.go.kr/eng/boardContents/actionBoardContentsCons0024.do (accessed on 1 November 2025).
  35. Kim, M.; Park, M.-S. The GOCI-II Early Mission Marine Fog Detection Products: Optical Characteristics and Verification. Korean J. Remote Sens. 2021, 37, 1317–1328. [Google Scholar] [CrossRef]
  36. KARI—Geostationary Satellite. Available online: https://www.kari.re.kr/eng/contents/165 (accessed on 1 November 2025).
  37. Jin, K.; Yang, K.; Choi, J. Image Radiometric Quality Assessment of the Meteorological Payload on GEO-KOMPSAT-2A. Aerosp. Eng. Technol. 2013, 12, 30–39. [Google Scholar]
  38. Han, H.-J.; Yang, H.; Heo, J.-M.; Park, Y.-J. Systemic Design and Development of the Second Geostationary Ocean Color Satellite Ground Segment. KIISE Trans. Comput. Pract. 2019, 25, 477–484. [Google Scholar] [CrossRef]
  39. Cho, Y.-K.; Kim, M.-O.; Kim, B.-C. Sea fog around the Korean Peninsula. J. Appl. Meteorol. 2000, 39, 2473–2479. [Google Scholar] [CrossRef]
  40. Zhang, S.-P.; Xie, S.-P.; Liu, Q.-Y.; Yang, Y.-Q.; Wang, X.-G.; Ren, Z.-P. Seasonal variations of Yellow Sea fog: Observations and mechanisms. J. Clim. 2009, 22, 6758–6772. [Google Scholar] [CrossRef]
  41. Zitova, B.; Flusser, J. Image Registration Methods: A Survey. Image Vis. Comput. 2003, 21, 977–1000. [Google Scholar] [CrossRef]
  42. Edstedt, J.; Sun, Q.; Bökman, G.; Wadenbäck, M.; Felsberg, M. RoMa: Robust Dense Feature Matching. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–24 June 2024; pp. 19790–19800. [Google Scholar] [CrossRef]
43. Yuan, Y.; Chen, X.; Wang, J. Object-Contextual Representations for Semantic Segmentation. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Volume 12354, pp. 173–190. [Google Scholar] [CrossRef]
  44. Liu, Z.; Mao, H.; Wu, C.-Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A ConvNet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11976–11986. [Google Scholar]
  45. Guo, M.-H.; Lu, C.-Z.; Hou, Q.; Liu, Z.; Cheng, M.-M.; Hu, S.-M. SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation. In Proceedings of the Advances in Neural Information Processing Systems 35 (NeurIPS 2022), New Orleans, LA, USA, 28 November–9 December 2022; pp. 1140–1156. [Google Scholar]
  46. Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. In Proceedings of the Advances in Neural Information Processing Systems 34 (NeurIPS 2021), Virtual, 6–14 December 2021; pp. 12077–12090. [Google Scholar]
47. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 10012–10022. [Google Scholar]
  48. Cheng, B.; Misra, I.; Schwing, A.G.; Kirillov, A.; Girdhar, R. Masked-Attention Mask Transformer for Universal Image Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 1290–1299. [Google Scholar]
Figure 1. The overall research flow is divided into image labeling, image co-registration, and model construction.
Figure 2. Spatial coverage of GK2A AMI and GK2B GOCI-II over the study region. The red outline indicates the GK2A AMI local-area image used in this study, and the numbered polygons represent the 12 GOCI-II observation slots covering the analysis domain.
Figure 3. Examples of imagery used for sea fog labeling over the Korean Peninsula and adjacent seas: (a) AMI FCC imagery, (b) GOCI-II FCC imagery, (c) AMI CTH imagery overlaid with in situ observations, with color bars, (d) annotation with legend.
Figure 4. Monthly distribution of the selected fog-containing scenes used in this study.
Figure 5. Illustration of the RoMa model process. Adapted from [42].
Figure 6. Example of image registration overlaid on the coastline provided by the Korea Hydrographic and Oceanographic Agency: (a) AMI image before registration, (b) GOCI-II image before registration, and (c) GOCI-II image after registration. The solid white line represents the KHOA coastline, serving as a reference for the visual assessment of co-registration accuracy.
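To illustrate the co-registration step shown in Figure 6, the sketch below warps a GOCI-II image onto the AMI grid from matched point pairs. It assumes that corresponding pixel coordinates (e.g., produced by a dense matcher such as RoMa [42]) are already available; the function name `coregister_to_ami` and the use of a single RANSAC homography are illustrative simplifications, not the exact pipeline of this study.

```python
import numpy as np
import cv2

def coregister_to_ami(goci_img: np.ndarray, goci_pts: np.ndarray,
                      ami_pts: np.ndarray, ami_shape: tuple) -> np.ndarray:
    """Warp a GOCI-II image onto the AMI pixel grid from matched point pairs.

    goci_pts, ami_pts: (N, 2) float32 arrays of corresponding pixel coordinates.
    ami_shape: (height, width) of the target AMI grid.
    """
    # Robustly estimate a homography with RANSAC to suppress outlier matches.
    H, _inliers = cv2.findHomography(goci_pts, ami_pts, cv2.RANSAC, 3.0)
    h, w = ami_shape[:2]
    # Resample the GOCI-II image into the AMI pixel grid.
    return cv2.warpPerspective(goci_img, H, (w, h))
```

The warped result can then be overlaid on the KHOA coastline, as in Figure 6c, to visually assess the residual misalignment.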
Figure 7. FCC and labeled images for the test dataset, along with the inference results from each detection model. TN, TP, FP, and FN are indicated in black, white, orange, and red, respectively.
Figure 8. FCC images from AMI and GOCI-II, fog products from AMI and GOCI-II, and annotation and model inference images. Fog pixels are represented in white across the product, annotation, and model inference panels.
Figure 9. Comparisons of sea fog detection using AMI-only and AMI + GOCI-II. TN, TP, FP, and FN are indicated in black, white, orange, and red, respectively.
Figure 10. True color images, false-color composite (FCC) images, and inference results on sea fog days. Location labels indicate the main regions referenced in the text (e.g., West Coast and Jeju Island). The red outline indicates the coastline used for geographic reference.
Figure 11. Example of the worst result of sea fog detection by the Swin Transformer model. TN, TP, FP, and FN are indicated in black, white, orange, and red, respectively.
Table 1. Spectral bands of GK2A AMI and GK2B GOCI-II.

| Satellite/Sensor | Channel No. | Central Wavelength (μm) | Band Type | Main Application |
|---|---|---|---|---|
| GK2A/AMI | Ch. 1 | 0.47 | Blue (VIS) | Aerosol, haze detection |
| GK2A/AMI | Ch. 2 | 0.51 | Green (VIS) | Ocean and land surface, vegetation monitoring |
| GK2A/AMI | Ch. 3 | 0.64 | Red (VIS) | Cloud/sea fog detection, vegetation index |
| GK2A/AMI | Ch. 4 | 0.86 | NIR | Cloud-top structure, vegetation monitoring |
| GK2A/AMI | Ch. 5 | 1.38 | NIR | Thin cirrus detection |
| GK2A/AMI | Ch. 6 | 1.60 | SWIR | Cloud/sea fog discrimination, snow/ice detection |
| GK2A/AMI | Ch. 7 | 2.25 | SWIR | Cloud phase detection |
| GK2A/AMI | Ch. 8 | 3.90 | TIR | Low-level cloud/sea fog detection, fire detection |
| GK2A/AMI | Ch. 9 | 6.20 | TIR | Upper-level water vapor |
| GK2A/AMI | Ch. 10 | 6.90 | TIR | Mid-level water vapor |
| GK2A/AMI | Ch. 11 | 7.30 | TIR | Lower-level water vapor |
| GK2A/AMI | Ch. 12 | 8.60 | TIR | Cloud phase, aerosol |
| GK2A/AMI | Ch. 13 | 9.60 | TIR | Ozone detection |
| GK2A/AMI | Ch. 14 | 10.40 | TIR | Cloud/sea fog brightness temperature |
| GK2A/AMI | Ch. 15 | 11.20 | TIR | Sea fog and low-level cloud detection |
| GK2A/AMI | Ch. 16 | 12.30 | TIR | Atmospheric moisture, cloud microphysics |
| GK2B/GOCI-II | B1 | 0.412 | VIS | Dissolved organic matter, ocean color detection |
| GK2B/GOCI-II | B2 | 0.443 | VIS | Chlorophyll, sea fog detection |
| GK2B/GOCI-II | B3 | 0.490 | VIS | Ocean color, chlorophyll a |
| GK2B/GOCI-II | B4 | 0.555 | VIS | Standard ocean color index |
| GK2B/GOCI-II | B5 | 0.620 | VIS | Suspended sediments, dissolved matter detection |
| GK2B/GOCI-II | B6 | 0.660 | VIS | Red tide, chlorophyll detection |
| GK2B/GOCI-II | B7 | 0.680 | VIS | Phytoplankton fluorescence |
| GK2B/GOCI-II | B8 | 0.709 | VIS | Red tide, eutrophication monitoring |
| GK2B/GOCI-II | B9 | 0.745 | NIR | Sea fog/cloud detection, sea surface reflection correction |
| GK2B/GOCI-II | B10 | 0.865 | NIR | Atmospheric correction, sea fog detection |
| GK2B/GOCI-II | B11 | 1.375 | NIR | Thin cirrus detection |
| GK2B/GOCI-II | B12 | 1.610 | SWIR | Cloud/sea fog discrimination, snow detection |
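As a minimal sketch of how the bands in Table 1 could be combined after co-registration, the snippet below stacks the 16 AMI channels and 12 GOCI-II bands into a single multi-channel array. The array names, the use of all channels, and the per-channel min–max scaling are illustrative assumptions rather than the exact preprocessing used in this study.

```python
import numpy as np

def build_fused_input(ami: np.ndarray, goci2: np.ndarray) -> np.ndarray:
    """Stack AMI (16, H, W) and co-registered GOCI-II (12, H, W) channels
    into one 28-band input array for a segmentation model."""
    assert ami.shape[1:] == goci2.shape[1:], "grids must match after co-registration"
    fused = np.concatenate([ami, goci2], axis=0)          # (28, H, W)
    # Per-channel min-max scaling so reflectances and brightness temperatures
    # (heterogeneous units) share a comparable numeric range.
    mins = fused.min(axis=(1, 2), keepdims=True)
    maxs = fused.max(axis=(1, 2), keepdims=True)
    return (fused - mins) / (maxs - mins + 1e-6)
```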
Table 2. Overview of annotation images.

| Item | Description |
|---|---|
| Annotation period | January 2023–July 2024 |
| Acquisition time | 08:00–12:00 KST (hourly) |
| Number of acquisition days | 24 days |
| Number of annotated images | 86 scenes |
| Annotated image size | 8000 × 6000 pixels |
| Pixel ratio (sea fog vs. non-sea fog) | 4% vs. 96% |
Table 3. Hyperparameter ranges explored via Optuna.

| Hyperparameter | Range |
|---|---|
| Trials | 5 |
| Learning rate | 1 × 10⁻⁵–1 × 10⁻³ |
| Batch size | 2, 4, 8 |
| Dropout ratio | 0.1–0.5 |
| Weight decay | 1 × 10⁻²–1 × 10⁻¹ |
| Background class weight | 0.5–1.0 |
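The search space in Table 3 can be expressed as an Optuna objective. The sketch below mirrors those ranges; `train_and_validate` is a placeholder stub standing in for the actual training and validation loop, which should return the validation score (e.g., mean IoU) to be maximized.

```python
import optuna

def train_and_validate(lr, batch_size, dropout, weight_decay, bg_weight):
    # Placeholder: replace with the real training/validation loop that
    # returns a validation score such as mean IoU for the sampled setting.
    return 0.0

def objective(trial: optuna.Trial) -> float:
    # Search space mirroring Table 3.
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    batch_size = trial.suggest_categorical("batch_size", [2, 4, 8])
    dropout = trial.suggest_float("dropout_ratio", 0.1, 0.5)
    weight_decay = trial.suggest_float("weight_decay", 1e-2, 1e-1, log=True)
    bg_weight = trial.suggest_float("background_class_weight", 0.5, 1.0)
    return train_and_validate(lr, batch_size, dropout, weight_decay, bg_weight)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=5)   # 5 trials per model, as in Table 3
print(study.best_params)
```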
Table 4. Hyperparameters selected by Optuna autotuning.

| Model | Learning Rate | Batch Size | Drop Path Rate | Weight Decay | Background Weight |
|---|---|---|---|---|---|
| OCRNet | 4 × 10⁻⁴ | 8 | 0.301 | 1 × 10⁻² | 0.78 |
| ConvNeXt | 2 × 10⁻⁴ | 4 | 0.238 | 2 × 10⁻² | 0.8 |
| SegNeXt | 2 × 10⁻⁵ | 4 | 0.444 | 2 × 10⁻² | 0.89 |
| SegFormer | 2 × 10⁻⁴ | 4 | 0.148 | 2 × 10⁻² | 0.79 |
| Swin Transformer | 4 × 10⁻⁵ | 2 | 0.274 | 4 × 10⁻² | 1.00 |
| Mask2Former | 2 × 10⁻⁵ | 2 | 0.435 | 1 × 10⁻² | 0.69 |
Table 5. Evaluation metrics for sea fog detection models using co-registered AMI and GOCI-II images. Best scores are in bold.

| Category | Model | Backbone | Head | IoU | Accuracy | Precision | FAR | Recall | F1-Score |
|---|---|---|---|---|---|---|---|---|---|
| CNN-based models | OCRNet | HRNet-W48 | OCRNet | 71.194 | 98.390 | 81.766 | 18.234 | 84.630 | 83.173 |
| CNN-based models | ConvNeXt | ConvNeXt-Large | UPerNet | 75.605 | 98.632 | 82.381 | 17.619 | 90.189 | 86.108 |
| CNN-based models | SegNeXt | MSCAN-Large | SegNeXt | 76.110 | 98.619 | 80.277 | 19.723 | **93.616** ¹ | 86.435 |
| Transformer-based models | SegFormer | MIT-B4 | SegFormer | 72.049 | 98.362 | 78.452 | 21.548 | 89.825 | 83.754 |
| Transformer-based models | Swin Transformer | Swin-Large | UPerNet | **77.242** ¹ | **98.726** ¹ | **82.796** ¹ | **17.204** ¹ | 92.009 | **87.160** ¹ |
| Transformer-based models | Mask2Former | Swin-Large | Mask2Former | 76.118 | 98.621 | 80.360 | 19.640 | 93.515 | 86.440 |

¹ The highest scores across all models.
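The metrics reported in Tables 5–7 follow the standard pixel-wise definitions based on the TP, FP, FN, and TN counts visualized in Figures 7, 9, and 11. The sketch below assumes binary fog masks (fog = 1, non-fog = 0), that both classes are present in the evaluated scene, and reports values in percent; note that FAR defined this way equals 100 minus Precision, consistent with the paired values in Table 5.

```python
import numpy as np

def fog_metrics(pred: np.ndarray, label: np.ndarray) -> dict:
    """Pixel-wise sea fog detection metrics (fog = 1, non-fog = 0), in percent."""
    tp = np.sum((pred == 1) & (label == 1))
    fp = np.sum((pred == 1) & (label == 0))
    fn = np.sum((pred == 0) & (label == 1))
    tn = np.sum((pred == 0) & (label == 0))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "IoU": 100 * tp / (tp + fp + fn),
        "Accuracy": 100 * (tp + tn) / (tp + fp + fn + tn),
        "Precision": 100 * precision,
        "FAR": 100 * fp / (tp + fp),   # false alarm ratio = 100 - Precision
        "Recall": 100 * recall,
        "F1-score": 100 * 2 * precision * recall / (precision + recall),
    }
```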
Table 6. Swin Transformer's performance across the slots.

| Slot | Major Region (Approx.) | No. of Patches | IoU | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|---|---|---|
| S001 | Southern Japanese Waters | 10 | 49.050 | 99.149 | 49.439 | 98.423 | 65.817 |
| S002 | Eastern Japanese Waters | 10 | 51.746 | 95.790 | 58.331 | 82.090 | 68.200 |
| S004 | Central-Southern East Sea and East Coast | 10 | 81.899 | 99.107 | 84.158 | 96.826 | 90.049 |
| S005 | Northeastern East Sea and Northwestern Japan | 10 | 76.488 | 95.294 | 79.223 | 95.681 | 86.678 |
| S007 | Korean Peninsula | 10 | 92.114 | 98.554 | 94.612 | 97.214 | 95.895 |
| S008 | Northern East Sea and Southern Primorsky Krai | 10 | 66.240 | 98.025 | 94.658 | 68.812 | 79.692 |
| S010 | Southern Bohai Sea and Eastern China | 10 | 89.658 | 99.270 | 92.970 | 96.179 | 94.547 |
Table 7. Performance comparison when using only AMI and when using both AMI and GOCI-II.

| Model | IoU | Accuracy | Precision | FAR | Recall | F1-Score |
|---|---|---|---|---|---|---|
| Only AMI | 76.046 | 98.685 | 84.130 | 15.870 | 88.782 | 86.393 |
| AMI + GOCI-II | 77.242 | 98.726 | 82.796 | 17.204 | 92.009 | 87.160 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
