Article

Semantic Segmentation of Typical Oceanic and Atmospheric Phenomena in SAR Images Based on Modified Segformer

1 State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China
2 Engineering Research Center of Ocean Remote Sensing Big Data, Fujian Province University, Xiamen 361102, China
3 Joint Center for Ocean Remote Sensing, University of Xiamen-Delaware University, Xiamen 361005, China
4 National Key Laboratory of Scattering and Radiation, Beijing 100854, China
5 Fujian Hisea Digital Technology Co., Ltd., Sanming 365001, China
6 College of Earth, Ocean and Environment, University of Delaware, Newark, DE 19716, USA
* Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(1), 113; https://doi.org/10.3390/rs18010113
Submission received: 26 November 2025 / Revised: 22 December 2025 / Accepted: 25 December 2025 / Published: 28 December 2025
(This article belongs to the Special Issue Microwave Remote Sensing on Ocean Observation)

Highlights

What are the main findings?
  • A semantic segmentation dataset covering 12 typical oceanic and atmospheric phenomena is constructed, using 2383 Sentinel-1 WV mode images and 2628 IW mode sub-images with 100 m resolution and 256 × 256 pixels.
  • Our modified Segformer model, Segformer-OcnP (integrating an improved ASPP module, a CA module, and progressive upsampling), outperforms classic models such as U-Net and the original Segformer, achieving 80.98% mDice, 70.32% mIoU, and 86.77% OA.
What are the implications of the main findings?
  • The dataset addresses the lack of diverse, multi-phenomenon SAR segmentation data, supporting AI-driven ocean–atmosphere observation research.
  • Segformer-OcnP has improved segmentation accuracy for small-scale and complex phenomena, providing a tool for pixel-level recognition of oceanic and atmospheric processes.

Abstract

Synthetic Aperture Radar (SAR) images of the sea surface reveal a variety of oceanic and atmospheric phenomena. Automatically detecting and identifying these phenomena is essential for understanding ocean dynamics and ocean–atmosphere interactions. This study selected 2383 Sentinel-1 Wave (WV) mode images and 2628 Interferometric Wide swath (IW) mode sub-images to construct a semantic segmentation dataset covering 12 typical oceanic and atmospheric phenomena, with a balanced distribution of approximately 400 sub-images per category and a total of 5011 samples. The images in this dataset have a resolution of 100 m and dimensions of 256 × 256 pixels. We propose the Segformer-OcnP model, based on Segformer, for the semantic segmentation of these multiple oceanic and atmospheric phenomena. Experimental results demonstrate that Segformer-OcnP outperforms classic CNN-based models (U-Net, DeepLabV3+) and mainstream Transformer-based models (SETR, the original Segformer), achieving 80.98% mDice, 70.32% mIoU, and 86.77% Overall Accuracy, verifying its superior segmentation performance.

1. Introduction

The exchange of energy and mass between the ocean and the atmosphere influences global water circulation, climate change, and biogeochemical cycles [1,2,3]; its role in the global environment, climate, and ecological balance cannot be overstated. Traditional ocean observation relies predominantly on in situ measurements. However, these methods incur high observation costs and are limited in coverage, making it difficult to meet the demand for short-term, large-scale ocean observation [4]. By contrast, ocean remote sensing allows distant, wide-ranging observation of the ocean. Synthetic Aperture Radar (SAR) is an active microwave imaging radar characterized by day-and-night, all-weather, high-resolution observation capabilities. Unlike optical sensors, SAR can penetrate clouds and is largely unaffected by weather, making it especially advantageous for observing the ocean surface under adverse conditions. The accumulation of a large number of SAR images has provided a wealth of research data for ocean studies [5,6].
Traditional methods for detecting oceanic and atmospheric phenomena in SAR images rely primarily on feature selection and threshold setting [7,8,9,10]. However, these methods are sensitive to noise and generalize poorly. With the application of artificial intelligence in oceanography, researchers have introduced deep learning methods and constructed data-driven models for detecting oceanic and atmospheric phenomena, which extract the features of different phenomena more accurately and significantly enhance model generalization [4].
Deep learning technology has demonstrated powerful image segmentation capabilities in the field of computer vision and has become a reliable tool for extracting precise pixel objects in SAR images. Deep learning methods have been proposed to automatically extract various oceanic and atmospheric phenomena from SAR images, such as sea surface oil spills, sea ice, ocean eddies, and ocean internal waves [11,12,13,14]. However, deep learning methods are data-driven methods, and numerous studies have highlighted the challenges of creating deep learning datasets, which require significant time and effort [4,15,16]. Additionally, most related research has focused on limited areas or single phenomena [14,17], thereby restricting the scope of holistic multi-phenomena observations across diverse sea-surface conditions.
Fortunately, the first SAR image oceanic and atmospheric phenomena dataset for image classification was released by Wang et al. [18]. This dataset comprises manually selected Sentinel-1 Wave (WV) mode images from 2016, annotated with 10 types of geophysical phenomena: atmospheric fronts, biogenic slicks, icebergs, low wind speed areas, microwave convection cells, ocean fronts, pure sea waves, rainfall cells, sea ice, and wind streaks. The study indicated that deep learning models trained on this dataset achieved satisfactory results, with excellent classification performance. However, image-level classification has limitations when multiple phenomena are present in a single SAR image. To address this situation, Colin et al. compared various fully supervised and weakly supervised methods for segmenting different oceanic and atmospheric phenomena at the pixel level [19]. Their experiments show that fully supervised frameworks (such as U-Net) consistently outperform weakly supervised methods, although they emphasize that increasing the dataset size could further improve prediction accuracy. Additionally, their study used input images with a spatial resolution of 100 m, but the generated output had a coarser resolution of 400 m, limiting the ability to capture fine structural details; this resolution disparity presents challenges for tasks that require high spatial precision. Finally, while their work highlights the potential for expanding the use of WV mode data, the reliance on a dataset composed solely of WV mode images limits data diversity, which is a critical factor in enhancing the robustness of semantic segmentation models.
Therefore, this paper aims to construct a semantic segmentation dataset of SAR images that includes various oceanic and atmospheric phenomena to achieve pixel-level segmentation of SAR images. Specifically, we developed a semantic segmentation model tailored to this task, validated it by comparing it with defined geographic ground truth, and further evaluated the segmentation results of certain phenomena using external data for additional validation.
The paper is organized as follows: Section 2 describes the dataset construction; Section 3 details the modified Segformer model and the training strategy; Section 4 presents the segmentation results and their validation; Section 5 discusses limitations; and Section 6 concludes the paper.

2. Dataset

2.1. Focused Phenomena

Based on existing research and the classification of oceanic and atmospheric phenomena in SAR images in the TenGeoP-SARwv dataset, we construct a SAR image semantic segmentation dataset comprising 12 types of oceanic and atmospheric phenomena. The TenGeoP-SARwv dataset includes 10 phenomena: Atmospheric Fronts (AF), Oceanic Fronts (OF), Rainfall (RF), Icebergs (IB), Sea Ice (SI), Pure Ocean Waves (POW), Wind Streaks (WS), Low Wind Areas (LWA), Biological Slicks (BS), and Micro Convective Cells (MCC) [18].
In addition, based on frequently observed phenomena in SAR images, we incorporated two new typical oceanic phenomena: Ocean Internal Waves (IWs) and Ocean Eddies (Eddy). Ocean internal waves typically form within the ocean’s stable density stratification and can significantly modulate sea surface roughness; this modulation manifests in SAR images as alternating bright and dark stripes [20]. Ocean eddies change sea surface roughness by carrying tracers (such as biological slicks and sea ice) or by affecting the surface flow field, creating distinct elliptical patches or bands in SAR images. Depending on their formation mechanisms, they are primarily categorized as “black eddies” and “white eddies” [21]. In this study, our primary focus is on “black eddies.” All the phenomena considered in this paper are presented in Figure 1.

2.2. Dataset Construction

We used Sentinel-1 Interferometric Wide swath (IW) mode and WV mode data to construct the dataset. The IW mode is Sentinel-1’s primary acquisition mode for land and coastal regions, while the WV mode is mainly used for open ocean areas. Colin et al. have demonstrated the feasibility of semantic segmentation using WV mode images, as well as the potential for transferring this application to IW mode segmentation [19]. To improve the model’s applicability and training data diversity, we constructed the dataset using both Sentinel-1 IW and WV mode images.
For WV mode images, we incorporate the TenGeoP-SARwv dataset. Since this dataset is an image classification dataset with only one label per image, it cannot be directly used for pixel-level semantic segmentation tasks. Therefore, we selected 2383 WV mode images from the dataset for semantic segmentation annotation. Additionally, we referenced the annotations proposed by Colin et al. [19] to further enhance the accuracy and reliability of the annotations.
For IW mode images, we constructed the dataset using a total of 484 Sentinel-1 IW mode Ground Range Detected (GRD) products acquired globally from 2015 to 2022. Among them, we incorporated 156 images from the internal wave object detection dataset proposed by Tao et al. [22], which prominently feature internal wave characteristics, thereby enhancing the accuracy of internal wave annotations.
To maintain consistency with the processing of WV-mode images, we applied a preprocessing approach similar to that of TenGeoP-SARwv [23] to the selected IW images. However, since IW-mode images are often acquired near coastlines and are affected by land, we additionally performed land–sea segmentation to eliminate land interference. The preprocessing workflow for IW mode images is illustrated in Figure 2. Radiometric calibration converts the digital number values of SAR images to normalized backscatter coefficients (σ0) using the SigmaNaught information from the product XML files, mitigating the influence of radar system parameters and imaging conditions and highlighting the inherent properties of targets. Down-sampling then uniformly resamples the Sentinel-1 images to a 100 m resolution, which is fine enough to capture the structural characteristics of oceanic and atmospheric phenomena. Next, the sea–land mask uses the Global Self-consistent Hierarchical High-resolution Shorelines (GSHHS) database to mask land areas in coastal IW mode images, eliminating land interference and enhancing ocean contrast for the accurate identification of marine and atmospheric phenomena. Re-calibration then applies the CMOD5.N geophysical model function (GMF) to adjust σ0, reducing the influence of the radar incidence angle, since sea surface roughness (SSR) and the corresponding backscatter tend to decrease with increasing incidence angle in C-band VV-polarized SAR images. Finally, percentile-based normalization removes outliers and enhances target contrast.
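To make the last two preprocessing steps concrete, the following minimal sketch illustrates GSHHS-based land masking and percentile-based normalization. The function names and the 2nd/98th percentile clipping bounds are our own assumptions for illustration; the paper does not specify the exact clipping percentiles.

```python
import numpy as np

def apply_land_mask(sigma0, land_mask):
    """Set land pixels (from a GSHHS-derived boolean mask) to NaN so that
    only ocean backscatter contributes to later statistics."""
    masked = sigma0.astype(np.float32).copy()
    masked[land_mask] = np.nan
    return masked

def normalize_percentile(sigma0, low=2.0, high=98.0):
    """Percentile-based normalization: clip outliers outside an assumed
    [2nd, 98th] percentile range and stretch backscatter to [0, 1]."""
    lo, hi = np.nanpercentile(sigma0, [low, high])
    clipped = np.clip(sigma0, lo, hi)
    return (clipped - lo) / (hi - lo + 1e-12)
```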
To construct a dataset suitable for semantic segmentation tasks, we employed a sliding window approach to crop the normalized 8-bit and 16-bit images with a resolution of 100 m into non-overlapping 256 × 256 sub-images. The 8-bit images were used for visual interpretation and annotation, while the 16-bit images were used for model training. Each type of oceanic and atmospheric phenomenon is represented by approximately 400 sub-images. The data distribution is shown in Figure 3. The left figure shows the geographical distribution of images in the dataset, while the table on the right presents the number of available images for each category.
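A minimal sketch of the non-overlapping sliding-window cropping is given below; edge handling (remainder rows and columns are simply dropped here) is an assumption, since the paper does not state how scene borders are treated.

```python
import numpy as np

def tile_scene(img, tile=256):
    """Crop a preprocessed 100 m resolution scene into non-overlapping
    tile x tile sub-images (remainder pixels at the right/bottom edges
    are discarded in this sketch)."""
    h, w = img.shape[:2]
    sub_images = []
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            sub_images.append(img[r:r + tile, c:c + tile])
    return sub_images
```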
The cropped sub-images were then annotated using the Labelme software based on the original SAR images. For the 10 oceanic and atmospheric phenomena with existing research, we referred to the guidelines provided by Benchaabane et al. [24] and Wang et al. [23] to ensure labeling accuracy. For the two newly added oceanic phenomena, ocean eddies and ocean internal waves, we established segmentation standards based on relevant literature. Specifically, for ocean eddies, the minimum enclosing shape is adopted as the ground truth label; notably, since biological slicks often act as tracers for ocean eddies [25,26,27], the eddy phenomenon is prioritized over biological slicks for annotation in cases of overlap. For oceanic internal waves, which appear in SAR images as irregular alternating light and dark stripes, we referenced publicly accessible object detection datasets [22] to inform the annotation process, thereby guaranteeing the reliability of labeling for this phenomenon. For areas on the ocean surface with no obvious phenomena, we mark them as Background (BG). Additionally, to mitigate the impact of ships and offshore wind turbines on the segmentation results of oceanic and atmospheric phenomena, this dataset also includes classifications for artificial objects.
After annotation, the generated JSON files were converted into PNG format to create the dataset labels. Figure 4 presents selected images and their labels from the dataset, where rows a and c show the original images and rows b and d show the corresponding semantic segmentation annotations. Specifically, a1 includes IWs and BG; a2 includes RF and MCC; a3 includes LWA and BS; a4 includes POW and IB; c1 includes POW, AF, and MCC; c2 includes BS and Eddy; c3 includes POW and OF; and c4 includes SI.
Once the image labels were obtained, all SAR images of different categories and imaging modes were randomly divided into training, validation, and test sets in an 8:1:1 ratio. This process resulted in a total of 5011 experimental data samples, including 4036 training images, 483 validation images, and 492 test images. The test set is entirely independent of the training and validation sets.
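The split can be reproduced with a sketch such as the one below. Note that the paper performs the 8:1:1 split across categories and imaging modes; the simplified version here splits a single list of sample identifiers, and the random seed is an assumption.

```python
import random

def split_811(sample_ids, seed=0):
    """Randomly split sample identifiers into training/validation/test
    subsets at an 8:1:1 ratio (simplified: one global split)."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(0.8 * len(ids))
    n_val = int(0.1 * len(ids))
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]
```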

3. Methodology

3.1. Segformer-OcnP

Semantic segmentation of oceanic and atmospheric phenomena in SAR images is challenged by the multi-scale nature of targets, blurred inter-class boundaries, and complex background interference. To address these challenges, we propose the Segformer-OcnP model built on the Segformer architecture [28]. The structure of Segformer-OcnP network is shown in Figure 5.

3.1.1. Encoder

Aiming at semantic segmentation tasks, the original Segformer is a transformer-based framework that incorporates a hierarchical encoder composed of Mix Vision Transformers (MiTs), and this encoder exhibits prominent capabilities in extracting global contextual features as well as the ability to generate multi-level feature maps. We leverage this robust architectural backbone while tailoring the component configurations to accommodate the unique radiometric characteristics of SAR imagery. The encoder of Segformer-OcnP is constructed with multiple Transformer Blocks. These blocks work in a synergistic manner to process input SAR images layer by layer: they first capture fine-grained local texture details of oceanic and atmospheric phenomena and then gradually aggregate high-level global spatial distribution information. Through this layered processing mechanism, the encoder generates multi-scale feature maps that cover different semantic levels, laying a foundation for the segmentation task.
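For orientation, the hierarchical encoder yields four feature maps at strides 4, 8, 16, and 32 relative to the 256 × 256 input. The sketch below only computes the expected tensor shapes; the channel widths follow the MiT-B2 configuration and are an assumption, since the paper does not state which MiT variant is used.

```python
def encoder_output_shapes(batch=1, in_size=256, channels=(64, 128, 320, 512)):
    """Expected shapes of the four multi-scale feature maps F1..F4
    produced by a hierarchical MiT encoder (strides 4, 8, 16, 32)."""
    shapes = []
    for stage, ch in enumerate(channels):
        stride = 4 * 2 ** stage
        shapes.append((batch, ch, in_size // stride, in_size // stride))
    return shapes

print(encoder_output_shapes())
# [(1, 64, 64, 64), (1, 128, 32, 32), (1, 320, 16, 16), (1, 512, 8, 8)]
```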

3.1.2. Decoder

Despite the original Segformer’s good performance in feature extraction, its decoder relies on simple Multilayer Perceptron (MLP) layers. These layers struggle to accurately restore detailed information from the multi-scale feature maps, which leads to subpar segmentation results, especially for small-scale targets and the blurred boundaries of oceanic and atmospheric phenomena. To address this limitation, we enhance the original Segformer decoder by incorporating a modified Atrous Spatial Pyramid Pooling (ASPP) module [29] and a Coordinate Attention (CA) module [30], and by adopting a progressive upsampling approach to fuse feature maps of different scales.
For the modified Atrous Spatial Pyramid Pooling (ASPP) module proposed in this study, we set the dilation rates of the atrous convolutions to [1, 2, 5, 6], a deliberate design choice aimed at two core objectives. First, this zigzag arrangement of dilation rates avoids common-factor relationships between adjacent values, which effectively mitigates the well-known “grid effect” [31], a phenomenon that leads to discontinuous feature sampling and low pixel utilization when dilation rates share common divisors. Second, the multi-level rate combination covers both small and large scales: the smaller rates capture fine-grained local details of small-scale oceanic and atmospheric phenomena, while the larger rates extract large-scale contextual information of macro phenomena, thus accommodating the diverse scale characteristics of the research objects. Additionally, the global average pooling layer in the standard ASPP has limitations, as it cannot fully capture the feature information of oceanic and atmospheric phenomena with varying shapes. Therefore, we add a Mixed Pooling Model (MPM) [32] to the ASPP structure, which combines different pooling methods to capture both short-range and long-range dependencies in the feature maps. This hybrid framework enables the module to capture short-range local feature dependencies through fine-grained pooling operations while also modeling long-range contextual correlations across the entire feature map, compensating for the deficiency of global average pooling in shape-aware feature extraction and enhancing the module’s representational capability for complex oceanic and atmospheric features. The process is as follows (F_n denotes a feature map, DilConv_r denotes a dilated convolution with dilation rate r, and ⊕ denotes feature concatenation):
$F_{aspp} = \mathrm{DilConv}_{1}(F_{5}) \oplus \mathrm{DilConv}_{2}(F_{5}) \oplus \mathrm{DilConv}_{5}(F_{5}) \oplus \mathrm{DilConv}_{6}(F_{5}) \oplus \mathrm{MPM}(F_{5})$
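A minimal PyTorch sketch of this branch is shown below. The dilation rates [1, 2, 5, 6] follow the paper; the internal structure of the MPM branch is simplified here to horizontal and vertical strip pooling, and the fusion convolution is an assumption rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn

class ModifiedASPP(nn.Module):
    """Sketch of the modified ASPP: four dilated 3x3 convolutions with rates
    1, 2, 5, 6 plus a mixed-pooling branch, concatenated and fused by a 1x1
    convolution (the MPM internals are a simplifying assumption)."""

    def __init__(self, in_ch, out_ch, rates=(1, 2, 5, 6)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        )
        # Simplified mixed pooling: average over rows and over columns (strip pooling)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))
        self.pool_proj = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.fuse = nn.Conv2d(out_ch * (len(rates) + 1), out_ch, 1, bias=False)

    def forward(self, x):
        feats = [b(x) for b in self.branches]
        strip = self.pool_h(x).expand_as(x) + self.pool_w(x).expand_as(x)
        feats.append(self.pool_proj(strip))
        return self.fuse(torch.cat(feats, dim=1))
```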
For the upsampling module, we employ a progressive upsampling method similar to that used in U-Net to fuse the four feature maps of different scales extracted by the Segformer encoder. This approach allows the network to better utilize contextual information and reduces the information loss incurred by the direct upsampling in the original Segformer [33]. The process is as follows (Upsample(F, s) denotes upsampling feature map F by a factor of s, and ⊕ denotes feature concatenation):
$F_{up1} = \mathrm{Upsample}(F_{4}, 2) \oplus F_{3}$
$F_{up2} = \mathrm{Upsample}(F_{up1}, 2) \oplus F_{2}$
$F_{up3} = \mathrm{Upsample}(F_{up2}, 2) \oplus F_{1}$
$F_{up} = \mathrm{Upsample}(F_{up3}, 1)$
Furthermore, we introduce a CA module to enhance feature map fusion, thereby improving the segmentation capability of the target region. Finally, after adjusting the number of channels through 1 × 1 convolution and upsampling, we obtain a feature map with the same size and dimensions as the original Segformer output, but with richer information.
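The following sketch illustrates the progressive upsampling and fusion described by the equations above. Channel reduction to a common width, 3 × 3 fusion convolutions, and bilinear interpolation are assumptions for illustration; in the actual model a CA block and a 1 × 1 classification convolution follow the fused feature map.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProgressiveFusion(nn.Module):
    """Sketch of progressive upsampling: the coarsest feature map is
    upsampled by 2 and concatenated with the next finer level, repeated
    until the stride-4 level is reached."""

    def __init__(self, chans, out_ch=256):
        super().__init__()
        # chans: channels of (F1, F2, F3, F4), ordered from fine to coarse
        self.reduce = nn.ModuleList(nn.Conv2d(c, out_ch, 1) for c in chans)
        self.fuse = nn.ModuleList(nn.Conv2d(2 * out_ch, out_ch, 3, padding=1)
                                  for _ in range(3))

    def forward(self, f1, f2, f3, f4):
        f1, f2, f3, f4 = [r(f) for r, f in zip(self.reduce, (f1, f2, f3, f4))]
        x = f4
        for fuse, skip in zip(self.fuse, (f3, f2, f1)):
            x = F.interpolate(x, scale_factor=2, mode="bilinear",
                              align_corners=False)
            x = fuse(torch.cat([x, skip], dim=1))
        return x  # stride-4 feature map, then CA and a 1x1 conv in the full model
```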

3.2. Training Strategy

All experiments were carried out on an NVIDIA GeForce RTX 3090 GPU using the PyTorch 2.0.1 framework. Regarding hyperparameter settings, the batch size for all experimental models is set to 16, and the models are trained for 80,000 iterations. A two-stage learning rate schedule is adopted: LinearLR with a start factor of 1 × 10−6 is used for the first 1500 iterations as a learning rate warm-up, after which the scheduler switches to PolyLR until the end of the 80,000 iterations to balance convergence speed and stability. The AdamW optimizer is configured with betas of 0.9 and 0.999, and a parameter-wise strategy is employed in which the learning rate of the decode head is scaled by 10 and no weight decay is applied to normalization layers and position blocks.
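A hedged sketch of this optimization setup using standard PyTorch schedulers is shown below; the base learning rate and weight-decay values are assumptions (the paper does not report them), and the parameter-name patterns used to identify the decode head, normalization layers, and position blocks are illustrative.

```python
from torch.optim import AdamW
from torch.optim.lr_scheduler import LinearLR, PolynomialLR, SequentialLR

def build_optim(model, base_lr=6e-5, weight_decay=0.01, max_iters=80000, warmup=1500):
    decode_params, norm_params, other_params = [], [], []
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        if "decode_head" in name:                     # decode head: lr scaled by 10
            decode_params.append(p)
        elif "norm" in name or "pos_block" in name:   # no weight decay
            norm_params.append(p)
        else:
            other_params.append(p)
    optimizer = AdamW(
        [{"params": other_params},
         {"params": norm_params, "weight_decay": 0.0},
         {"params": decode_params, "lr": base_lr * 10}],
        lr=base_lr, betas=(0.9, 0.999), weight_decay=weight_decay)
    # Warm-up for 1500 iterations, then polynomial decay until 80,000 iterations
    warmup_sched = LinearLR(optimizer, start_factor=1e-6, total_iters=warmup)
    poly_sched = PolynomialLR(optimizer, total_iters=max_iters - warmup, power=1.0)
    scheduler = SequentialLR(optimizer, [warmup_sched, poly_sched], milestones=[warmup])
    return optimizer, scheduler
```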
Due to significant differences in pixel distribution among the various phenomena, there is a distinct pixel imbalance in the dataset. This study therefore adopts a multi-loss fusion strategy to train the deep learning model, combining weighted cross-entropy loss, Dice loss, and Focal loss. The weight coefficients of the three losses are all set to 1.0, a choice based on their functional complementarity and suitability for the task. Additionally, we assign each phenomenon a class weight equal to the reciprocal of its pixel-level prior probability, further mitigating the adverse impact of pixel imbalance on model training.
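The combined loss can be sketched as follows. Here class_weights is a 1-D tensor of the reciprocals of the per-class pixel frequencies, as described above; the focusing parameter gamma = 2 for the Focal term and the exact Dice formulation are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CombinedLoss(nn.Module):
    """Sketch of the combined loss: weighted cross-entropy + Dice + Focal,
    each weighted 1.0."""

    def __init__(self, class_weights, gamma=2.0, eps=1e-6):
        super().__init__()
        self.register_buffer("w", class_weights)
        self.gamma, self.eps = gamma, eps

    def forward(self, logits, target):
        # logits: (B, C, H, W); target: (B, H, W) integer class labels
        ce = F.cross_entropy(logits, target, weight=self.w)
        probs = logits.softmax(dim=1)
        onehot = F.one_hot(target, logits.shape[1]).permute(0, 3, 1, 2).float()
        inter = (probs * onehot).sum(dim=(0, 2, 3))
        denom = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
        dice = 1.0 - ((2 * inter + self.eps) / (denom + self.eps)).mean()
        pt = (probs * onehot).sum(dim=1).clamp_min(self.eps)  # prob of the true class
        focal = (-(1 - pt) ** self.gamma * pt.log()).mean()
        return ce + dice + focal
```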
We used the Dice coefficient (Dice) and Intersection over Union (IoU) as metrics to evaluate the positional differences between each segmented phenomenon and the ground-truth labels, and Overall Accuracy (OA) as the proportion of correctly classified pixels. The calculation formulas are shown below, where X is the predicted result, Y is the true label, and TP, TN, FP, and FN denote the numbers of true positive, true negative, false positive, and false negative pixels, respectively.
$OA = \frac{TP + TN}{TP + FP + FN + TN}$
$IoU = \frac{|X \cap Y|}{|X \cup Y|} = \frac{TP}{TP + FN + FP}$
$\mathrm{Dice}(X, Y) = \frac{2|X \cap Y|}{|X| + |Y|} = \frac{2TP}{2TP + FN + FP}$
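These metrics can be computed from a per-class confusion matrix, as in the sketch below (pred and label are integer class maps of the same shape); mDice and mIoU are the means of the per-class values.

```python
import numpy as np

def segmentation_metrics(pred, label, num_classes):
    """Compute per-class IoU and Dice plus overall accuracy from a
    confusion matrix, matching the formulas above."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (label.ravel(), pred.ravel()), 1)
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    iou = tp / np.maximum(tp + fp + fn, 1)
    dice = 2 * tp / np.maximum(2 * tp + fp + fn, 1)
    oa = tp.sum() / cm.sum()
    return iou, dice, oa
```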
Because oceanic and atmospheric phenomena influenced by different environmental factors have varying characteristics, a large and diverse dataset is crucial for achieving good segmentation results. Data augmentation methods can effectively expand our dataset, enhancing its diversity and improving the network’s segmentation capabilities. Therefore, in this study, we augment the original dataset with several techniques, including horizontal and vertical flipping, image rotation, and photometric distortion.
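A minimal augmentation sketch is given below; rotation is restricted to multiples of 90 degrees and the photometric distortion is a simple multiplicative brightness jitter, both of which are simplifying assumptions relative to the augmentations actually used.

```python
import random
import torch
import torchvision.transforms.functional as TF

def augment(image, mask):
    """Random horizontal/vertical flips, 90-degree rotations, and a simple
    brightness jitter applied jointly to an image tensor and its label mask."""
    if random.random() < 0.5:
        image, mask = TF.hflip(image), TF.hflip(mask)
    if random.random() < 0.5:
        image, mask = TF.vflip(image), TF.vflip(mask)
    k = random.randint(0, 3)  # rotate by k * 90 degrees
    if k:
        image = torch.rot90(image, k, (-2, -1))
        mask = torch.rot90(mask, k, (-2, -1))
    image = image * random.uniform(0.9, 1.1)  # photometric distortion (assumed range)
    return image, mask
```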

4. Results and Validation

4.1. Ablation Experiments Results

To verify the role of each module in the Segformer-OcnP, ablation experiments are conducted on the test set in this section. All experiments adopt the mean Dice coefficient, mean IoU, and Overall Accuracy (OA) for quantitative analysis, aiming to verify the segmentation performance of the combined loss function, improved ASPP module, MPM module, CA attention modules, and progressive upsampling, respectively.
Table 1 presents the results of the ablation experiments verifying the effectiveness of each improved module, with performance evaluated by three metrics: mDice, mIoU, and OA. Exp0 serves as the baseline using the original Segformer with cross-entropy loss, achieving 78.02% mDice, 67.17% mIoU, and 85.26% OA. Exp1 adopts Segformer with the combined loss, which slightly improves mDice to 78.83% and mIoU to 68.08%, while OA remains stable at 85.20%. Exp2 builds on Exp1 by adding the modified ASPP, raising mDice to 79.46%, mIoU to 68.59%, and OA to 85.39%. Exp3 further incorporates the MPM into Exp2, resulting in 79.92% mDice, 69.18% mIoU, and 85.45% OA. Exp4 introduces CA into Exp3, leading to a notable increase in OA to 86.41%, alongside 80.31% mDice and 69.71% mIoU. Exp5 integrates progressive upsampling on the basis of Exp4, achieving the best performance with 80.98% mDice, 70.32% mIoU, and 86.77% OA.
The experimental results demonstrate that each module added in this study contributes to a certain improvement in the segmentation of typical oceanic and atmospheric phenomena in SAR images.

4.2. Overall Evaluation Results

To compare the segmentation performance of different models, we select four classic semantic segmentation models for comparative experiments. To ensure the fairness of the experiments, all networks are trained in the same hardware environment and obtain segmentation results based on the same training set, validation set, and test set. Table 2 presents the segmentation results of each model on the test set.
The Segformer-OcnP model demonstrates the best performance, with scores of 80.98% (mDice), 70.32% (mIoU), and 86.77% (OA). Compared to U-Net, the proposed model improves the Dice score by 8.67 percentage points, IoU by 11.03, and accuracy by 7.76. Compared to the baseline Segformer, the proposed model improves the average Dice score by 2.15 percentage points, IoU by 2.24, and accuracy by 1.57. A visual comparison of the segmentation results is given in Figure 6, in which row (a) shows the original images, row (b) the ground truth, and rows (c) to (g) the segmentation results from U-Net, DeepLabV3+, SETR, Segformer, and Segformer-OcnP, respectively. The comparison confirms that the most promising method is Segformer-OcnP, consistent with the quantitative results.
The results indicate that when multiple phenomena coexist in SAR images, U-Net and DeepLabV3+ frequently exhibit misclassifications, unclear boundary segmentation, and severe distortion. Additionally, due to the limited receptive field of CNNs, these models cannot accurately identify large-scale, long-range phenomena, resulting in lower segmentation accuracy, and they fail to delineate the contours of small-scale phenomena such as icebergs and ocean fronts. In contrast, the Transformer-based models SETR and Segformer improve on these results, achieving relatively accurate boundary recognition for various phenomena, yet they still make segmentation errors for small-scale phenomena and phenomena with complex features. Notably, the Segformer-OcnP model proposed in this study demonstrates superior segmentation capability for oceanic and atmospheric phenomena: it accurately segments the boundaries of different phenomena in complex scenes, improves the segmentation of small-scale phenomena with clear contours, and has a lower false alarm rate, producing results that closely align with the ground-truth labels.

4.3. Segmentation Results for Different Phenomena

Table 3 presents the Dice coefficients for the segmentation of the twelve oceanic and atmospheric phenomena across the five networks, with the best result for each phenomenon highlighted in bold. The results show that the Segformer-OcnP model proposed in this study achieves the best segmentation results for eight phenomena: ocean fronts, rainfall, icebergs, sea ice, pure ocean waves, wind streaks, ocean internal waves, and ocean eddies. Compared to the baseline Segformer network, the detection of ocean fronts, icebergs, and eddies improves markedly, with Dice coefficients increasing by 3.40, 9.53, and 6.28 percentage points, respectively, demonstrating the enhanced capability of the Segformer-OcnP network in extracting small targets and learning complex features. Although atmospheric fronts, biological slicks, low wind speed areas, and micro convective cells did not achieve the best results, the differences from the best results are minimal, at 0.07, 0.17, 0.06, and 0.04 percentage points, respectively, which are within an acceptable range. Overall, the Segformer-OcnP model exhibits the best comprehensive segmentation performance.
Among the twelve typical oceanic and atmospheric phenomena, large-scale phenomena such as rainfall, sea ice, pure ocean waves, wind streaks, low wind speed areas, micro convective cells, ocean internal waves, and biological slicks have relatively distinct features, resulting in better segmentation outcomes, with Dice coefficients generally exceeding 80% across the five networks. For phenomena with complex features, such as atmospheric fronts and ocean eddies, the models exhibit some misclassification. Atmospheric fronts have multiple distinct signatures [36], making it difficult for the Segformer-OcnP model to learn all their characteristics from a limited dataset. Ocean eddies take various forms due to their different formation mechanisms, such as “black eddies” and “white eddies” [21], and this feature diversity adds to the segmentation challenge. The segmentation results for icebergs were the lowest among all phenomena for the five networks, primarily because icebergs are small and often occupy only a few pixels in an image; the low contrast between icebergs and the background further increases the difficulty of segmenting them accurately. Furthermore, Sentinel-1 IW mode images are collected near the coast, where artificial structures such as ships share similar characteristics with icebergs, leading to a higher misclassification rate.
Notably, each network demonstrates excellent recognition capability for sea ice, with Dice coefficients exceeding 95%. This high accuracy is primarily due to the large coverage area of SAR images and the distinct characteristics of sea ice. Additionally, this study focuses on the basic features of sea ice without classifying its types, making the image segmentation task similar to an image classification task.

4.4. Comparison with Visual Interpretation Results

To validate the segmentation performance of the Segformer-OcnP model on full Sentinel-1 IW mode images, we select a Sentinel-1 IW mode image containing multiple phenomena for testing. First, the original image undergoes the preprocessing steps; the results are shown in Figure 7. The SAR image primarily contains three oceanic and atmospheric phenomena: low wind speed areas, biological slicks, and micro convective cells. Additionally, small-scale ocean eddies are present, as indicated by the red boxed sub-image areas in Figure 7b,c. The entire SAR image is then divided into 256 × 256 sub-images with a certain overlap rate and input into the Segformer-OcnP model for testing (a minimal sketch of this tiled inference is given below). The segmentation results are shown in Figure 8 and clearly display the three primary oceanic and atmospheric phenomena: the red mask represents low wind speed areas, the yellow mask biological slicks, and the purple mask micro convective cells (Figure 8b,c). The Segformer-OcnP model also successfully identifies and segments the smaller-scale ocean eddies, indicated by the brown mask areas, achieving accurate segmentation results.
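In the sketch below, the scene is cropped into overlapping 256 × 256 windows, per-class scores are averaged in overlapping regions, and the argmax gives the final label map. The stride, the score-averaging strategy, the single-channel input, and the class count (12 phenomena plus background and artificial objects) are assumptions; the paper only states that a certain overlap rate is used.

```python
import numpy as np
import torch

def segment_full_scene(model, scene, tile=256, stride=192, num_classes=14):
    """Tile a full preprocessed scene with overlap, average per-class scores
    in overlapping regions, and return the argmax label map. Edge tiles are
    handled naively in this sketch."""
    h, w = scene.shape
    scores = np.zeros((num_classes, h, w), dtype=np.float32)
    counts = np.zeros((h, w), dtype=np.float32)
    model.eval()
    with torch.no_grad():
        for r in range(0, max(h - tile, 0) + 1, stride):
            for c in range(0, max(w - tile, 0) + 1, stride):
                patch = torch.from_numpy(scene[r:r + tile, c:c + tile]).float()
                logits = model(patch[None, None])            # (1, C, tile, tile)
                probs = logits.softmax(dim=1)[0].numpy()
                scores[:, r:r + tile, c:c + tile] += probs
                counts[r:r + tile, c:c + tile] += 1
    return (scores / np.maximum(counts, 1)).argmax(axis=0)
```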
To validate the segmentation performance of the Segformer-OcnP model on Sentinel-1 WV mode images, we select several WV mode images containing typical oceanic and atmospheric phenomena for testing. As shown in Figure 9, rows a and c show the original images, while rows b and d show the corresponding semantic segmentation results. The results demonstrate that for large-scale phenomena such as pure ocean waves (a1), wind streaks (a3), rainfall (c1), low wind speed areas (c2), biological slicks (c4), and sea ice (c3), the proposed model accurately identifies and segments the corresponding regions. Additionally, the model also performs well in segmenting smaller-scale oceanic and atmospheric phenomena within the images, such as icebergs (a4), ocean fronts (a2), and ocean eddies (c4).

4.5. Case Study

In SAR images, different phenomena often overlap, which poses a significant challenge for semantic segmentation tasks. We selected ocean internal waves and rainfall (two typical oceanic and atmospheric phenomena) for external data validation, as they cause significant changes in sea surface roughness and have a lower likelihood of overlapping with other phenomena. This ensures clearer visual interpretation and more reliable segmentation result evaluation.

4.5.1. Oceanic Internal Wave

We use the oceanic internal wave object detection dataset [37] to validate the segmentation performance of the Segformer-OcnP model on ocean internal waves. Since our dataset construction references part of the dataset from Tao et al., to avoid data overlap, an additional set of images was selected. These images were captured on 7 December 2017, and 9 April 2020, in the Celebes Sea region. As shown in Figure 10, the green boxed areas in the images indicate the object detection annotations for ocean internal waves.
First, the original Sentinel-1 IW mode images were downloaded and reprocessed. The preprocessing results are shown in Figure 11. Next, the entire IW mode image was cropped into sub-images with a certain overlap rate and input into Segformer-OcnP model for testing to obtain the final segmentation results. Figure 12 shows the overlay of the internal wave segmentation results with the original images, where the colored areas represent the internal wave segmentation results.
Comparing the object detection annotations in Figure 10 with the segmentation results in Figure 12, Segformer-OcnP model can clearly and accurately extract internal wave stripes from complex oceanic and atmospheric phenomena, with the extraction results consistent with the object detection labels. Additionally, the segmentation results reveal extra internal wave stripes, as indicated by the red boxes in the figure. This demonstrates that Segformer-OcnP model not only can clearly segment large-scale internal wave stripes but also performs well in segmenting group internal wave stripes.

4.5.2. Rainfall

In this section, the Segformer-OcnP model’s segmentation results for rainfall are validated using IMERG data [38], a Level 3 product from the GPM mission. IMERG integrates and interpolates microwave precipitation estimates, infrared precipitation estimates, and ground-based gauge data to produce precipitation products with a temporal resolution of 0.5 h and a spatial resolution of 0.1°. We selected the IMERG half-hourly products whose 0.5 h window covers the SAR acquisition time; this temporal averaging inevitably introduces discrepancies with the instantaneous rainfall areas observed in the SAR images. This study extracts rainfall data from IMERG to characterize rainfall areas and compares them with the model’s segmentation results.
We select two typical IW mode image cases of Sentinel-1 containing rainfall phenomena, which were taken in the Mediterranean area and the sea near Singapore on 29 October 2022 and 30 October 2022, respectively. The preprocessing results are shown in Figure 13. As can be clearly seen from the figure, both selected SAR images exhibit observable rainfall phenomena. After preprocessing, the images were cropped into sub-images and input into Segformer-OcnP model to obtain the segmentation results for the rainfall phenomena. The overlay of the segmentation results with the original images is shown in Figure 14, where the colored areas represent the model’s rainfall segmentation results. From the perspective of visual interpretation, Segformer-OcnP model accurately segmented the rainfall phenomena over the ocean surface.
The comparison between the model segmentation results and the GPM rainfall data is shown in Figure 15 and Figure 16. The rainfall areas identified by the GPM data closely correspond to the rainfall areas in the segmentation results. The slight differences observed may be due to the GPM IMERG product providing average rainfall data over a 0.5-h period, during which the rainfall areas can change over time. In conclusion, Segformer-OcnP model proposed in this study can accurately identify rainfall areas over the ocean, demonstrating robust segmentation performance.

5. Discussion

While ocean–atmosphere interactions in SAR imagery are well-documented, automated detection and classification within large-scale datasets remain significant challenges. This difficulty stems from the inherent complexity of these processes, whose subtle, context-dependent features are often indistinguishable using traditional rule-based or shallow learning approaches.
This paper focuses on the semantic segmentation task, which aims to classify each pixel in an image to segment different phenomena. This method assumes that each pixel belongs to a mutually exclusive category. However, in practice, different categories occasionally overlap, meaning a single pixel may at times belong to multiple categories, increasing the complexity of segmentation. Introducing new categories that encompass mixed oceanic and atmospheric phenomena can help alleviate this issue to some extent. Additionally, the dataset consists of images with a resolution of 100 m and dimensions of 256 × 256 pixels, covering relatively small areas (25 km × 25 km for IW mode sub-images and 20 km × 20 km for WV mode sub-images), which limits segmentation accuracy for large-scale phenomena. Balancing training image size and model performance is crucial for addressing this challenge. Employing multi-scale segmentation methods can aid in segmenting larger-scale phenomena.

6. Conclusions

In this study, we constructed a SAR semantic segmentation dataset using Sentinel-1 IW and WV mode data, encompassing, to our knowledge, the broadest range of oceanic and atmospheric phenomena among existing SAR segmentation datasets. The dataset includes twelve typical phenomena: atmospheric fronts, ocean fronts, rainfall, icebergs, sea ice, pure ocean waves, wind streaks, low wind areas, biogenic slicks, microwave convection cells, ocean internal waves, and ocean eddies. This dataset enables researchers to automatically detect various phenomena in SAR images and to evaluate the segmentation performance of different models. Furthermore, its availability has the potential to accelerate the application of deep learning methods in SAR-based ocean observation.
To address the varying scales and complex features of ocean–atmosphere phenomena, we developed Segformer-OcnP by enhancing the baseline architecture with a modified ASPP module, CA module, and progressive upsampling. Comparative experiments demonstrate that Segformer-OcnP outperforms various classical networks with superior metrics of 80.98% mDice, 70.32% mIoU, and 86.77% OA. Validation through visual interpretation and case studies confirms that the model’s segmentation outputs align closely with professional analysis and external reference data.
The datasets and models presented in this study offer enhanced capabilities for process identification, serving as a robust baseline to support related academic and technical research. Future research will expand the dataset by incorporating additional phenomenon types and multi-source data to develop a marine foundation model. Meanwhile, the model will be optimized to address overlapping phenomena and multi-scale targets, thereby enhancing segmentation robustness in complex ocean environments. Furthermore, the proposed framework will be integrated with SAR data and physical oceanographic models, aiming to achieve dynamic monitoring of ocean–atmosphere interactions, improve the model’s interpretability in practical applications, and enhance the prediction accuracy of model outputs.

Author Contributions

Conceptualization, Q.L.; writing—original draft preparation, Q.L.; writing—review and editing, X.B., L.H., X.G. and X.-H.Y.; visualization, Q.L., L.L. and Y.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Major Science and Technology Project of Fujian Province under Grant 2024YZ040025 and Xiamen Natural Science Foundation Project under Grant 3502Z202573006.

Data Availability Statement

The dataset constructed in this paper can be downloaded from https://doi.org/10.5281/zenodo.11410662 (accessed on 26 November 2025) [39]. The dataset contains the SAR images, JSON files, and PNG annotations. Researchers who need the model code of this study for academic research can contact the corresponding author (gengxp@xmu.edu.cn).

Acknowledgments

The authors would like to thank Fujian Hisea Digital Technology Co., Ltd. and Fujian Satellite Data Development Co., Ltd.

Conflicts of Interest

Author Yaohui Bao was employed by the company Fujian Hisea Digital Technology Co. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Deike, L. Mass Transfer at the Ocean–Atmosphere Interface: The Role of Wave Breaking, Droplets, and Bubbles. Annu. Rev. Fluid Mech. 2022, 54, 191–224. [Google Scholar] [CrossRef]
  2. Held, I.M. The Partitioning of the Poleward Energy Transport between the Tropical Ocean and Atmosphere. J. Atmos. Sci. 2001, 58, 943–948. [Google Scholar] [CrossRef]
  3. Holland, J.Z.; Rasmusson, E.M. Measurements of the Atmospheric Mass, Energy, and Momentum Budgets Over a 500-Kilometer Square of Tropical Ocean. Mon. Weather Rev. 1973, 101, 44–55. [Google Scholar] [CrossRef]
  4. Li, X.; Liu, B.; Zheng, G.; Ren, Y.; Zhang, S.; Liu, Y.; Gao, L.; Liu, Y.; Zhang, B.; Wang, F. Deep-learning-based information mining from ocean remote-sensing imagery. Natl. Sci. Rev. 2020, 7, 1584–1605. [Google Scholar] [CrossRef]
  5. Asiyabi, R.M.; Ghorbanian, A.; Tameh, S.N.; Amani, M.; Jin, S.; Mohammadzadeh, A. Synthetic Aperture Radar (SAR) for Ocean: A Review. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 9106–9138. [Google Scholar] [CrossRef]
  6. Ouchi, K.; Yoshida, T. On the Interpretation of Synthetic Aperture Radar Images of Oceanic Phenomena: Past and Present. Remote Sens. 2023, 15, 1329. [Google Scholar] [CrossRef]
  7. Alpers, W.; Huang, W. On the Discrimination of Radar Signatures of Atmospheric Gravity Waves and Oceanic Internal Waves on Synthetic Aperture Radar Images of the Sea Surface. IEEE Trans. Geosci. Remote Sens. 2011, 49, 1114–1126. [Google Scholar] [CrossRef]
  8. Chen, J.; Sun, J.; Yang, J. Typical Ocean Features Detection in SAR Images. In Proceedings of the 2008 International Workshop on Education Technology and Training & 2008 International Workshop on Geoscience and Remote Sensing, Shanghai, China, 21–22 December 2008. [Google Scholar]
  9. Fiscella, B.; Giancaspro, A.; Nirchio, F.; Pavese, P.; Trivero, P. Oil spill detection using marine SAR images. Int. J. Remote Sens. 2000, 21, 3561–3566. [Google Scholar] [CrossRef]
  10. Topouzelis, K.; Kitsiou, D. Detection and classification of mesoscale atmospheric phenomena above sea in SAR imagery. Remote Sens. Environ. 2015, 160, 263–272. [Google Scholar] [CrossRef]
  11. Du, Y.; Song, W.; He, Q.; Huang, D.; Liotta, A.; Su, C. Deep learning with multi-scale feature fusion in remote sensing for automatic oceanic eddy detection. Inform. Fusion 2019, 49, 89–99. [Google Scholar] [CrossRef]
  12. Jeon, H.; Kim, J.; Vadivel, S.K.P.; Kim, D.-J. A study on classifying sea ice of the summer arctic ocean using sentinel-1 A/B SAR data and deep learning models. Korean J. Remote Sens. 2019, 35, 999–1009. [Google Scholar] [CrossRef]
  13. Krestenitis, M.; Orfanidis, G.; Ioannidis, K.; Avgerinakis, K.; Vrochidis, S.; Kompatsiaris, I. Oil Spill Identification from Satellite Images Using Deep Neural Networks. Remote Sens. 2019, 11, 1762. [Google Scholar] [CrossRef]
  14. Zi, N.; Li, X.-M.; Gade, M.; Fu, H.; Min, S. Ocean eddy detection based on YOLO deep learning algorithm by synthetic aperture radar data. Remote Sens. Environ. 2024, 307, 114139. [Google Scholar] [CrossRef]
  15. Gao, Y.; Gao, F.; Dong, J.; Wang, S. Transferred Deep Learning for Sea Ice Change Detection From Synthetic-Aperture Radar Images. IEEE Geosci. Remote Sens. Lett. 2019, 16, 1655–1659. [Google Scholar] [CrossRef]
  16. Hasimoto-Beltran, R.; Canul-Ku, M.; Mendez, G.M.D.; Ocampo-Torres, F.J.; Esquivel-Trava, B. Ocean oil spill detection from SAR images based on multi-channel deep learning semantic segmentation. Mar. Pollut. Bull. 2023, 188, 114651. [Google Scholar] [CrossRef]
  17. Zheng, Y.; Zhang, H.; Qi, K.; Ding, L.-Y. Stripe segmentation of oceanic internal waves in SAR images based on SegNet. Geocarto Int. 2022, 37, 8567–8578. [Google Scholar] [CrossRef]
  18. Wang, C.; Mouche, A.; Tandeo, P.; Stopa, J.E.; Longépé, N.; Erhard, G.; Foster, R.C.; Vandemark, D.; Chapron, B. A labelled ocean SAR imagery dataset of ten geophysical phenomena from Sentinel-1 wave mode. Geosci. Data J. 2019, 6, 105–115. [Google Scholar] [CrossRef]
  19. Colin, A.; Fablet, R.; Tandeo, P.; Husson, R.; Peureux, C.; Longépé, N.; Mouche, A. Semantic Segmentation of Metoceanic Processes Using SAR Observations and Deep Learning. Remote Sens. 2022, 14, 851. [Google Scholar] [CrossRef]
  20. Santos-Ferreira, A.M.; Da Silva, J.C.B.; Magalhaes, J.M. SAR Mode Altimetry Observations of Internal Solitary Waves in the Tropical Ocean Part 1: Case Studies. Remote Sens. 2018, 10, 644. [Google Scholar] [CrossRef]
  21. Ji, Y.; Xu, G.; Dong, C.; Yang, J.; Xia, C. Submesoscale eddies in the East China Sea detected from SAR images. Acta Oceanol. Sin. 2021, 40, 18–26. [Google Scholar] [CrossRef]
  22. Tao, M.; Xu, C.; Guo, L.; Wang, X.; Xu, Y. An Internal Waves Data Set From Sentinel-1 Synthetic Aperture Radar Imagery and Preliminary Detection. Earth Space Sci. 2022, 9, e2022EA002528. [Google Scholar] [CrossRef]
  23. Wang, C.; Tandeo, P.; Mouche, A.; Stopa, J.E.; Gressani, V.; Longepe, N.; Vandemark, D.; Foster, R.C.; Chapron, B. Classification of the global Sentinel-1 SAR vignettes for ocean surface process studies. Remote Sens. Environ. 2019, 234, 111457. [Google Scholar] [CrossRef]
  24. Benchaabane, A.; Peureux, C.; Soulat, F. A Labelled Dataset Description for SAR Images Segmentation; College Localisation Satellites: Ramonville Saint-Agne, France, 2022; pp. 1–13. [Google Scholar]
  25. Gade, M.; Byfield, V.; Ermakov, S.; Lavrova, O.; Mitnik, L. Slicks as indicators for marine processes. Oceanography 2013, 26, 138–149. [Google Scholar] [CrossRef]
  26. Kozlov, I.E.; Artamonova, A.V.; Manucharyan, G.E.; Kubryakov, A.A. Eddies in the Western Arctic Ocean From Spaceborne SAR Observations Over Open Ocean and Marginal Ice Zones. J. Geophys. Res. Ocean 2019, 124, 6601–6616. [Google Scholar] [CrossRef]
  27. Stuhlmacher, A.; Gade, M. Statistical analyses of eddies in the Western Mediterranean Sea based on Synthetic Aperture Radar imagery. Remote Sens. Environ. 2020, 250, 112023. [Google Scholar] [CrossRef]
  28. Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and efficient design for semantic segmentation with transformers. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 6–14 December 2021. [Google Scholar]
  29. Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848. [Google Scholar] [CrossRef]
  30. Hou, Q.; Zhou, D.; Feng, J. Coordinate Attention for Efficient Mobile Network Design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021. [Google Scholar]
  31. Wang, P.; Chen, P.; Yuan, Y.; Liu, D.; Huang, Z.; Hou, X.; Cottrell, G. Understanding convolution for semantic segmentation. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018. [Google Scholar]
  32. Hou, Q.; Zhang, L.; Cheng, M.-M.; Feng, J. Strip Pooling: Rethinking Spatial Pooling for Scene Parsing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020. [Google Scholar]
  33. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015. [Google Scholar]
  34. Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018. [Google Scholar]
  35. Zheng, S.; Lu, J.; Zhao, H.; Zhu, X.; Luo, Z.; Wang, Y.; Fu, Y.; Feng, J.; Xiang, T.; Torr, P.H.; et al. Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021. [Google Scholar]
  36. Catto, J.L.; Nicholls, N.; Jakob, C.; Shelton, K.L. Atmospheric fronts in current and future climates. Geophys. Res. Lett. 2014, 41, 7642–7650. [Google Scholar] [CrossRef]
  37. Figshare. Available online: https://figshare.com/articles/dataset/IWs_Dataset_v1_0/21365835 (accessed on 26 November 2025).
  38. Pradhan, R.K.; Markonis, Y.; Godoy, M.R.V.; Villalba-Pradas, A.; Andreadis, K.M.; Nikolopoulos, E.I.; Papalexiou, S.M.; Rahim, A.; Tapiador, F.J.; Hanel, M. Review of GPM IMERG performance: A global perspective. Remote Sens. Environ. 2022, 268, 112754. [Google Scholar] [CrossRef]
  39. Li, Q.; Bai, X.; Geng, X. A Dataset for Semantic Segmentation of Typical Oceanic and Atmospheric Phenomena from Sentinel-1 Images; Zenodo: Geneva, Switzerland, 2024. [Google Scholar] [CrossRef]
Figure 1. The 12 oceanic and atmospheric phenomena considered in this study: (a) Atmospheric Fronts (AF); (b) Oceanic Fronts (OF); (c) Rainfall (RF); (d) Icebergs (IB); (e) Sea Ice (SI); (f) Pure Ocean Waves (POW); (g) Wind Streaks (WS); (h) Low Wind Areas (LWA); (i) Biological Slicks (BS); (j) Micro Convective Cells (MCC); (k) Ocean Internal Waves (IWs); (l) Ocean Eddies (Eddy).
Figure 2. Sentinel-1 IW mode images preprocessing method.
Figure 3. The data distribution (red is WV mode, blue is IW mode) and the number of images for each category.
Figure 4. Examples of SAR images and their labels: (a,c) SAR images; (b,d) image labels.
Figure 5. The architecture of Segformer-OcnP.
Figure 6. Visualization of segmentation results of five models (a) SAR Images; (b) Ground truth; (c) U-Net; (d) DeepLabV3+; (e) SETR; (f) Segformer; (g) Segformer-OcnP.
Figure 7. Preprocessed Sentinel-1 IW mode image. (a) SAR image (b,c) Sub-images containing Eddy (acquisition date: 27 November 2022, UTC: 00:55).
Figure 8. Segmentation result display. The segmentation results are displayed overlaid with the original image. (a) The segmentation result of entire SAR image (b,c) The segmentation result of sub-images containing Eddy.
Figure 9. Sentinel-1 WV mode image segmentation Results. The segmentation results are displayed overlaid with the original image. (a,c) SAR Images; (b,d) segmentation results.
Figure 10. Internal wave object detection data example by Tao et al. (a) Case 1 (acquisition date: 12 July 2019, UTC: 21:56); (b) Case 2 (acquisition date: 4 September 2020, UTC: 21:42).
Figure 11. Re-preprocessing result image (a) Case 1; (b) Case 2.
Figure 12. Segmentation result display. The segmentation results for ocean internal waves are overlaid on the original images for visualization. (a) Case 1; (b) Case 2.
Figure 13. Preprocessed result image (a) Case 1 (acquisition date: 29 October 2022, UTC: 04:41); (b) Case 2 (acquisition date: 30 October 2022, UTC: 11:17).
Figure 14. Segmentation result display. The segmentation results for rainfall are overlaid on the original images for visualization. (a) Case 1; (b) Case 2.
Figure 15. Comparison of the segmentation result image of Case 1 and GPM data (a) Segmentation result; (b) GPM data.
Figure 16. Comparison of the segmentation result image of Case 2 and GPM data (a) Segmentation result; (b) GPM data.
Table 1. Ablation experiment results. The best results are shown in black bold.
Experiment | mDice (%) | mIoU (%) | OA (%)
Exp0: Segformer (Cross-entropy Loss) | 78.02 | 67.17 | 85.26
Exp1: Segformer (Combined Loss) | 78.83 | 68.08 | 85.20
Exp2: Exp1 + ASPP | 79.46 | 68.59 | 85.39
Exp3: Exp2 + MPM | 79.92 | 69.18 | 85.45
Exp4: Exp3 + CA | 80.31 | 69.71 | 86.41
Exp5: Exp4 + Progressive Upsampling | 80.98 | 70.32 | 86.77
Table 2. Comparison of segmentation results of five models. (Does not include background class).
Model | mDice (%) | mIoU (%) | OA (%)
U-Net [33] | 72.31 | 59.29 | 79.07
DeepLabV3+ [34] | 78.81 | 68.04 | 84.93
SETR [35] | 78.21 | 67.50 | 84.81
Segformer [28] | 78.83 | 68.08 | 85.20
Segformer-OcnP (Ours) | 80.98 | 70.32 | 86.77
Table 3. Dice coefficients (%) of segmentation of different phenomena by five models, the best results are shown in black bold.
Model | AF | OF | RF | IB | SI | POW
U-Net | 49.35 | 57.06 | 74.84 | 46.10 | 95.02 | 79.9
DeepLabV3+ | 61.17 | 64.81 | 84.17 | 44.01 | 99.31 | 84.33
SETR | 63.18 | 63.86 | 87.17 | 34.93 | 99.27 | 83.05
Segformer | 61.73 | 63.62 | 87.58 | 39.46 | 99.85 | 85.65
Segformer-OcnP | 63.11 | 67.02 | 87.79 | 48.99 | 99.87 | 86.47

Model | WS | LWA | BS | MCC | IWs | Eddy
U-Net | 86.65 | 85.61 | 85.32 | 80.25 | 82.42 | 45.17
DeepLabV3+ | 92.69 | 89.11 | 90.67 | 84.68 | 86.99 | 63.87
SETR | 91.72 | 89.99 | 90.78 | 84.71 | 84.64 | 65.18
Segformer | 91.44 | 89.54 | 90.60 | 84.50 | 86.10 | 65.92
Segformer-OcnP | 94.08 | 89.90 | 90.61 | 84.67 | 87.08 | 72.20

Share and Cite

MDPI and ACS Style

Li, Q.; Bai, X.; Hu, L.; Li, L.; Bao, Y.; Geng, X.; Yan, X.-H. Semantic Segmentation of Typical Oceanic and Atmospheric Phenomena in SAR Images Based on Modified Segformer. Remote Sens. 2026, 18, 113. https://doi.org/10.3390/rs18010113
