Article

LGFUNet: A Water Extraction Network in SAR Images Based on Multiscale Local Features with Global Information

Xiaowei Bai, Yonghong Zhang and Jujie Wei
1 Chinese Academy of Surveying and Mapping, Beijing 100036, China
2 Key Laboratory of Natural Resources Monitoring and Supervision in Southern Hilly Region, Ministry of Natural Resources, Changsha 410118, Hunan, China
* Author to whom correspondence should be addressed.
Sensors 2025, 25(12), 3814; https://doi.org/10.3390/s25123814
Submission received: 9 May 2025 / Revised: 14 June 2025 / Accepted: 16 June 2025 / Published: 18 June 2025
(This article belongs to the Section Remote Sensors)

Abstract

To address existing issues in water extraction from SAR images based on deep learning, such as confusion between mountain shadows and water bodies and difficulty in extracting complex boundary details for continuous water bodies, the LGFUNet model is proposed. The LGFUNet model consists of three parts: the encoder–decoder, the DECASPP module, and the LGFF module. In the encoder–decoder, the Swin-Transformer module is used instead of convolution kernels for feature extraction, enhancing the learning of global information and improving the model’s ability to capture the spatial features of continuous water bodies. The DECASPP module is employed to extract and select multiscale features, focusing on complex water body boundary details. Additionally, a series of LGFF modules is inserted between the encoder and decoder to reduce the semantic gap between the encoder and decoder feature maps and the spatial information loss caused by the encoder’s downsampling process, improving the model’s ability to learn detailed information. Sentinel-1 SAR data from the Qinghai–Tibet Plateau region are selected, and the water extraction performance of the proposed LGFUNet model is compared with that of existing methods such as U-Net, Swin-UNet, and SCUNet++. The results show that the LGFUNet model achieves the best performance across all evaluation indicators.

1. Introduction

Surface water is a crucial resource on Earth, supporting indispensable functions related to human production, daily life, and material and energy cycles [1,2,3,4,5]. Satellite remote sensing technology has been widely applied in surface water detection because of its fast monitoring speed, strong timeliness, and ability to obtain large-scale repetitive observations. Compared with optical remote sensing, synthetic aperture radar (SAR), as an active microwave system, can penetrate clouds, rain, and fog and is unaffected by lighting conditions, enabling all-weather, day-and-night target sensing. SAR systems obtain information about objects by actively emitting radar signals and receiving backscattered echoes from the Earth’s surface. When the radar signal emitted by a SAR system encounters a water surface, it typically undergoes specular reflection, with only a small portion of the signal returning to the receiver. In contrast, rough surfaces return stronger backscattered signals via mechanisms such as diffuse reflection, double-bounce scattering, or volume scattering. Therefore, by utilizing the differences in backscatter values between water and non-water objects in SAR images, it is possible to distinguish water bodies effectively. This characteristic makes SAR an invaluable tool for water extraction, with enormous potential [6].
There are two main types of methods for extracting water from SAR images: traditional methods and deep learning methods [4]. Traditional methods primarily include thresholding [7,8,9], clustering analysis [10,11,12,13,14], Markov random field (MRF) [15], and machine learning approaches [16]. However, these methods are often influenced by the heterogeneity of the observation environment and the inherent speckle noise in SAR images [17], making it difficult to meet the current demand for rapid, large-scale surface water extraction. Machine learning methods, represented by support vector machines (SVMs) and random forests (RFs) [18,19], can learn the differences in the patterns of water and non-water bodies across multiple feature dimensions through training datasets, thus improving the water extraction accuracy to some extent. However, most of the features in these methods rely on manual design, and their feature representation capabilities are limited, making it difficult to ensure that they can fully capture the distinctions between water bodies and other land features. In contrast, deep learning techniques possess powerful feature learning capabilities and do not require manual feature design [20], providing a new approach for water extraction from SAR images.
Over the past few decades, deep learning has undergone rapid advancements and has been extensively applied in various SAR remote sensing tasks, such as image classification [21,22,23,24], change detection [25,26], and target recognition [27,28,29,30,31]. In 2015, Long et al. proposed fully convolutional networks (FCNs) [32], which replaced the fully connected layers of traditional CNNs with convolutional layers, marking the first application of deep learning to semantic segmentation. Kang et al. used an FCN for flood monitoring and mapping with Gaofen-3 SAR imagery [33], but the FCN model failed to adequately consider the global spatial relationships among pixels [34], resulting in poor fine-scale segmentation. Ronneberger et al. enhanced the FCN architecture by expanding the decoder path and incorporating skip connections to link decoder and encoder features, leading to the U-Net network with its encoder–decoder structure [35]. Wang et al. compared the water extraction accuracies of three methods (the Otsu method, the object-oriented method, and U-Net) using Sentinel-1 SAR imagery and demonstrated the advantages of the U-Net model for water extraction [36]. However, owing to the similar backscattering characteristics of water bodies and land features such as mountain shadows in SAR images [25], the two can be confused during water extraction, especially in mountainous regions. Although U-Net improves water extraction compared with traditional methods, it still struggles to resolve the confusion between water bodies and shadows, and its ability to accurately extract complex boundary details for continuous water bodies is limited.
Owing to its robust generalization capabilities, supported by the U-shaped encoder–decoder architecture, U-Net has emerged as one of the predominant networks for semantic segmentation and water body extraction from SAR images [5]. Researchers have proposed numerous enhanced network models based on the U-Net framework. For example, Song et al. proposed an improved U-Net based on a hybrid attention mechanism (HA-UNet) for urban water extraction [37]. Through experiments with Sentinel-1A SAR images, they demonstrated that the proposed method significantly improved the accuracy of water body extraction in urban areas. Xu et al. developed a flood detection method for SAR images by integrating attention U-Net with a multiscale level set method and applied it to the dynamic monitoring of flood disasters in Jiangxi, Anhui, and Chongqing in 2020 [38]. However, this method still faces challenges in extracting water in mountainous areas. Wang et al. introduced dilated convolution and the spatial and channel squeeze and excitation (SCSE) attention mechanism [39] into U-Net, proposing a flood water extraction network (FWENet) based on SAR images [40]. This model improved the extraction of small water bodies and water body boundaries, but its distinction between water bodies and shadows remained suboptimal. More broadly, existing deep learning studies on SAR image-based water extraction rely primarily on convolutional kernels for feature extraction. Convolution kernels are limited by their receptive fields [4], hindering their ability to capture global information from SAR images. Consequently, challenges such as confusion between shadows and water bodies, as well as difficulties in extracting the detailed boundaries of continuous water bodies, persist.
To address the aforementioned issues, a novel end-to-end network framework, the local and global feature fusion UNet (LGFUNet), is developed in this study and applied to water extraction in the complex terrain of the Qinghai–Tibet Plateau to demonstrate the advantages of the proposed method. The main contributions of this paper are as follows:
(1)
A multiscale feature learning module (DECASPP) is proposed. Without increasing the number of model layers, DECASPP enables the model to learn important multiscale water body features, thereby enhancing the model’s ability to distinguish between shadows and water bodies.
(2)
To address the issue of the misdetection of small water bodies, the local and global feature fusion (LGFF) module is introduced; it integrates the global and local features of water bodies, improving the model’s ability to extract detailed information about small water bodies.
(3)
To address the challenge of extracting complex boundary details for continuous water bodies, the LGFUNet water extraction network model is established by combining the Swin-Transformer [41], DECASPP, and LGFF modules. This model comprehensively learns both the global and local features of water bodies at different scales, enhancing its ability to extract large-scale water bodies while effectively preserving the boundary details of spatially continuous and complex water bodies.

2. Methods

2.1. Overall Structure of the Proposed LGFUNet Model

The overall architecture of the proposed LGFUNet model is shown in Figure 1. The model consists of three main modules: the encoder–decoder module, the DECASPP module, and the LGFF module. The LGFUNet model begins by transforming the input image into patch tokens of size H/4 × W/4 and dimension C through a patch partitioning layer and a linear embedding layer. These patch tokens are then passed through the Swin-Transformer block to learn the global information associated with continuous water bodies. The features are subsequently downsampled through the patch merging layer, and the Swin-Transformer block is utilized to learn global water body information at different scales, sequentially obtaining multiscale image features with dimensions H/4 × W/4 × C, H/8 × W/8 × 2C, H/16 × W/16 × 4C, and H/32 × W/32 × 8C. At the end of the encoder stage, the features are passed to the DECASPP module, which further captures water body information at various scales without downsampling, providing valuable multiscale features for the subsequent decoder. During the decoder phase, the model employs the patch expanding layer to upsample the features, gradually restoring the feature size until it matches the input image dimensions. However, upsampling alone cannot fully recover the complete image features. To prevent information loss, the model incorporates a series of LGFF modules between the decoder and encoder. These modules autonomously filter detailed water body features while implementing skip connections to the corresponding layers in the decoder. Furthermore, the Swin-Transformer block is used to fuse the upsampled feature information with the salient water body features transmitted by the LGFF modules, thus integrating global and local information at different scales. Finally, the multiscale feature image generated by the decoder, which matches the input image size (H × W), undergoes linear projection to obtain pixel-level water body extraction results.
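To make the dimension flow concrete, the short trace below reproduces the encoder’s feature-map sizes for a 512 × 512 input patch (the sample size used in Section 3.1). The embedding dimension C = 96 is an assumption borrowed from the original Swin-Transformer [41]; the text does not state the value used here.

```python
# Shape trace of the LGFUNet encoder pyramid described above.
# Assumption: C = 96 (the Swin-Transformer default); the paper does not state C.
H, W, C = 512, 512, 96

h, w, c = H // 4, W // 4, C          # after patch partition + linear embedding
stages = []
for _ in range(4):                   # four Swin-Transformer stages
    stages.append((h, w, c))
    h, w, c = h // 2, w // 2, c * 2  # patch merging: halve resolution, double channels

print(stages)  # [(128, 128, 96), (64, 64, 192), (32, 32, 384), (16, 16, 768)]
```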

2.2. Swin-Transformer Block

As illustrated in Figure 2, the Swin-Transformer block consists of two consecutive structures, each comprising two LayerNorm (LN) layers, a multilayer perceptron (MLP) with GELU nonlinearity, residual connections, and a multihead self-attention (MSA) module. The MSA modules in these two structures utilize window-based MSA (W-MSA) and shifted window-based MSA (SW-MSA), respectively. Compared with convolutional architectures, the Swin-Transformer block enhances the model’s ability to capture long-range dependencies and learn the spatially continuous features of large-scale water bodies by implementing window-shifting operations through SW-MSA, which establishes connections across different self-attention windows.
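The sketch below mirrors this layout (LN, attention, residual; then LN, MLP with GELU, residual) in PyTorch. It is illustrative only: nn.MultiheadAttention stands in for the windowed W-MSA/SW-MSA of [41], whose window partitioning and cyclic shifting are omitted for brevity.

```python
import torch.nn as nn

class SwinBlockPair(nn.Module):
    """Two consecutive Transformer sub-blocks as described above. A faithful
    Swin block restricts attention to local windows (W-MSA) and cyclically
    shifts the windows in the second sub-block (SW-MSA); here a plain
    multihead attention stands in for both."""

    def __init__(self, dim, num_heads, mlp_ratio=4.0):
        super().__init__()
        def make():
            return nn.ModuleDict({
                "norm1": nn.LayerNorm(dim),
                "attn": nn.MultiheadAttention(dim, num_heads, batch_first=True),
                "norm2": nn.LayerNorm(dim),
                "mlp": nn.Sequential(
                    nn.Linear(dim, int(dim * mlp_ratio)),
                    nn.GELU(),
                    nn.Linear(int(dim * mlp_ratio), dim),
                ),
            })
        self.blocks = nn.ModuleList([make(), make()])

    def forward(self, x):                        # x: (B, L, dim) patch tokens
        for blk in self.blocks:                  # second pass would use SW-MSA
            h = blk["norm1"](x)
            h, _ = blk["attn"](h, h, h)
            x = x + h                            # residual connection
            x = x + blk["mlp"](blk["norm2"](x))  # MLP with GELU + residual
        return x
```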

2.3. DECASPP

Owing to the limitations of the Swin-Transformer block in extracting local image features, this paper integrates efficient channel attention (ECA) [42], depthwise separable convolution [43], and atrous spatial pyramid pooling [44] to construct the DECASPP module. Comprising six parallel branches, as illustrated in Figure 3, DECASPP includes four parallel attention pooling branches formed by depthwise separable convolution (DSConv) with different dilation rates, a 1 × 1 convolution mapping branch combined with ECA, and a global average pooling (GAP) branch. The input feature dimension of the module is 8C. GAP is responsible for downsampling the features to prevent overfitting in this layer. The four depthwise separable convolution branches with different dilation rates and the 1 × 1 convolution mapping branch are designed to capture contextual information for water bodies from various receptive fields. The depthwise separable convolution steps reduce the number of parameters while maintaining model performance, improving model efficiency. Additionally, the ECA mechanism is applied to filter multiscale features in the channel dimension, further enhancing the multiscale extraction effect. This helps provide rich multiscale features for the subsequent decoder and LGFF module. In the DECASPP module, the feature dimension and resolution remain unchanged.
ECA [42] can capture inter-channel dependencies without the need for dimensionality reduction or expansion. First, ECA performs global average pooling on the input features, and then it executes a size-k fast one-dimensional convolution based on each channel and its k neighboring channels to generate channel weights, thereby capturing the dependencies between channels. The value of k is adaptively determined through the channel dimension, as shown in the following formula:
k = \psi(C) = \left| \frac{\log_2(C)}{\gamma} + \frac{b}{\gamma} \right|_{odd}    (1)
In the expression, |t|_odd denotes the nearest odd number to t, with b set to 1 and γ set to 2; C represents the number of input channels.
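For instance, a direct transcription of Equation (1) (mirroring the rounding used in the ECA-Net reference implementation, which we assume applies here) gives k = 5 for the 8C = 768 input channels of this module when C = 96:

```python
import math

def eca_kernel_size(C, gamma=2, b=1):
    """k = |log2(C)/gamma + b/gamma|_odd: truncate, then bump to an odd number."""
    t = int(abs((math.log2(C) + b) / gamma))
    return t if t % 2 else t + 1

print(eca_kernel_size(768))  # 5
```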
For convolutional operations with predetermined input and output feature map dimensions as well as kernel sizes, the parameter counts for standard convolution and depthwise separable convolution are expressed as follows:
S_{conv} = M \times (D_K \times D_K \times N)    (2)

S_{DSC} = M \times (D_K \times D_K + N)    (3)
where M denotes the number of input feature channels, N represents the number of output feature channels, and the convolution kernel size is DK × DK. Sconv and SDSC correspond to the parameter counts of standard convolution and depthwise separable convolution, respectively, under the assumption of no bias terms. The effectiveness of depthwise separable convolution in reducing parameters becomes more pronounced as the number of output feature channels increases.
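A quick numerical check of Equations (2) and (3) illustrates the saving; the channel counts below are chosen only for illustration (matching the assumed 8C = 768 input of DECASPP):

```python
def conv_params(M, N, Dk):
    """Standard convolution, no bias: M * Dk * Dk * N parameters."""
    return M * Dk * Dk * N

def dsconv_params(M, N, Dk):
    """Depthwise separable convolution, no bias: M * (Dk * Dk + N) parameters."""
    return M * (Dk * Dk + N)

print(conv_params(768, 768, 3))    # 5,308,416
print(dsconv_params(768, 768, 3))  # 596,736  (~1/9 of the standard count)
```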

2.4. LGFF

The conventional skip connections in the U-Net architecture may inadequately bridge the semantic gap between the features of the encoder and decoder. To address this issue, we integrate a series of LGFF modules between the encoder and decoder, as illustrated in Figure 4. The LGFF module employs convolutional operations to extract local features from the water body information conveyed by the encoder and the DECASPP module, thereby compensating for the limitations of the Swin-Transformer structure in capturing detailed water body features and providing additional information to the decoder. The ECA mechanism is used to selectively filter the global features extracted by the encoder, and the refined global features are then combined with the local features extracted by the LGFF. This combined feature set is transmitted to the decoder to diminish the semantic disparity between the encoder and decoder, ameliorate the loss of spatial information during the encoder’s patch merging and downsampling processes, and enhance the model’s ability to learn intricate details of water bodies.
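A minimal sketch of this design is given below. The layer choices (a 3 × 3 convolutional branch for local detail, a 1-D convolutional channel gate in the style of ECA, and a 1 × 1 fusion convolution) are our assumptions for illustration; the published module may differ in its exact composition.

```python
import torch
import torch.nn as nn

class LGFF(nn.Module):
    """Illustrative LGFF sketch: a convolutional branch extracts local detail,
    an ECA-style gate filters the global encoder features in the channel
    dimension, and the two are fused before being passed to the decoder."""

    def __init__(self, channels, k=3):
        super().__init__()
        self.local = nn.Sequential(            # local-feature branch
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.eca = nn.Conv1d(1, 1, k, padding=k // 2, bias=False)  # channel gate
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, feat):                   # feat: (B, C, H, W) encoder features
        local = self.local(feat)
        w = feat.mean(dim=(2, 3))              # global average pool -> (B, C)
        w = torch.sigmoid(self.eca(w.unsqueeze(1))).squeeze(1)  # channel weights
        global_filtered = feat * w[:, :, None, None]
        return self.fuse(torch.cat([local, global_filtered], dim=1))
```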

3. Experiment

3.1. Dataset

In this study, the Qinghai–Tibet Plateau is the experimental area used to verify the advantages of the proposed algorithm. The plateau is dotted with numerous water bodies of varying sizes, including China’s largest inland saltwater lake, Qinghai Lake, as well as countless smaller lakes (with areas of less than 1 km2). Additionally, the Qinghai–Tibet Plateau is characterized by dense mountains and complex terrain. Water body extraction research in this region therefore requires not only the effective differentiation of mountain shadows and water bodies but also the identification of small water bodies and the complex boundaries of spatially continuous water bodies. Seven Sentinel-1 SAR images from the Qinghai–Tibet Plateau region are used to construct a water body dataset, with the basic information for the data sources presented in Table 1. The original data underwent preprocessing steps such as radiometric correction and geocoding, resulting in orthorectified images in which each pixel represents a ground extent of 14 × 14 m. These images, combined with a manual visual interpretation of Google optical imagery, were used to delineate ground truth water body labels. For training and validation purposes, the water body labels and the seven SAR images were divided into 3083 sample patches of size 512 × 512. Among these, 2573 patches were randomly selected as the training dataset, 201 patches as the test dataset, and the remaining 309 patches as the validation dataset. The study area and dataset examples are illustrated in Figure 5. In the labels, the black and white regions denote non-water and water bodies, respectively. The location distribution of the test dataset is shown in Figure 6.
The network proposed in this study was constructed with PyTorch 1.12.1, CUDA 11.3, and Python 3.10. It was trained on an NVIDIA Quadro RTX 5000 GPU with 16 GB of memory. Given the GPU memory constraints, the batch size was set to 4. The Adam optimization algorithm was employed to update the model gradients, with an initial learning rate of 0.001.
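The reported training configuration can be reproduced as follows; the model, data, and loss function here are stand-ins (the text does not name the loss), so this is a sketch of the setup rather than the authors’ code.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Reported setup: Adam optimizer, initial learning rate 0.001, batch size 4.
model = nn.Conv2d(1, 2, 3, padding=1)              # stand-in for LGFUNet
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()                  # assumed loss function

dummy = TensorDataset(torch.randn(8, 1, 512, 512),         # SAR patches
                      torch.randint(0, 2, (8, 512, 512)))  # water / non-water
loader = DataLoader(dummy, batch_size=4, shuffle=True)

for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```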

3.2. Comparison of Overall Performance Among Different Models

To facilitate a quantitative analysis of model performance, this study employs a confusion matrix derived from the comparison between predicted images and ground truth labels, as illustrated in Table 2. The four fundamental metrics within the confusion matrix are defined as follows:
True Positive (TP): The number of water pixels correctly classified as water.
True Negative (TN): The number of non-water pixels correctly classified as non-water.
False Positive (FP): The number of non-water pixels erroneously classified as water.
False Negative (FN): The number of water pixels erroneously classified as non-water.
To validate the effectiveness of the proposed model, we conducted comparative experiments between the LGFUNet model and several state-of-the-art networks, namely, U-Net, Swin-UNet [45], and SCUNet++ [46]. To comprehensively evaluate the performance of the LGFUNet model in water body extraction, we employed the following indicators: overall accuracy (OA), precision, recall, F1-score, intersection over union (IoU), and kappa [37]. OA represents the ratio of correctly classified pixels to the total number of pixels. Precision is the ratio of correctly predicted water pixels to all predicted water pixels. Recall is the ratio of correctly predicted water pixels to the actual number of water pixels. The F1-score balances precision and recall. The IoU is the ratio of the intersection to the union of the predicted water regions and the true water regions. Kappa is used to assess the degree of agreement between the water extraction results and the actual ground reference data. The definitions of these indicators are shown in Equations (4)–(9) as follows:
OA = \frac{TP + TN}{TP + FN + FP + TN}    (4)

Precision = \frac{TP}{TP + FP}    (5)

Recall = \frac{TP}{TP + FN}    (6)

F1\text{-}score = \frac{2TP}{2TP + FN + FP}    (7)

IoU = \frac{TP}{TP + FP + FN}    (8)

Kappa = \frac{OA - P_e}{1 - P_e}, \quad P_e = \frac{(TP + FN)(TP + FP) + (FP + TN)(FN + TN)}{(TP + TN + FP + FN)^2}    (9)
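Translated directly into code, Equations (4)–(9) can be computed from the four confusion-matrix counts:

```python
def water_metrics(tp, tn, fp, fn):
    """Evaluation indicators of Equations (4)-(9) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    oa = (tp + tn) / total
    pe = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total ** 2
    return {
        "OA": oa,
        "Precision": tp / (tp + fp),
        "Recall": tp / (tp + fn),
        "F1-score": 2 * tp / (2 * tp + fn + fp),
        "IoU": tp / (tp + fp + fn),
        "Kappa": (oa - pe) / (1 - pe),
    }
```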
The quantitative accuracy evaluation results for the LGFUNet model compared with the other models for the test dataset are presented in Table 3. Compared with the other methods, the proposed LGFUNet model demonstrates superior water extraction accuracy overall, with significant improvements across all the indicators. Specifically, the LGFUNet model achieves an OA of 99.31%, a precision of 96.31%, a recall of 92.50%, an F1-score of 94.04%, an IoU of 89.48%, and a kappa coefficient of 93.40%. Compared with the Swin-UNet model, the LGFUNet model yields improvements of 0.92%, 1.07%, 1.60%, 1.52%, 2.45%, and 1.78% across the respective indicators. These results indicate that the LGFUNet model can effectively identify and extract surface water bodies on the Qinghai–Tibet Plateau.
To provide an accurate and intuitive comparison of the results of these methods, we selected several typical areas within the study area for model performance comparisons. This comparison aimed to verify the advantages of the method proposed in this paper in terms of small lake water body extraction, the distinction between shadows and water bodies, and the preservation of complex boundary details for continuous water bodies.

3.3. Performance Comparison for the Identification of Small Lakes

The extraction of small lakes is crucial for water body extraction tasks. In this study, we compared the LGFUNet model with the three other models in small water body areas, and the results are illustrated in Figure 7. The visualization demonstrates that the proposed LGFUNet model excels in identifying and extracting small lake water bodies from SAR images, with overall performance superior to that of the other models. While U-Net achieves high precision, it tends to be overly conservative in extracting small water bodies, failing to capture all water bodies effectively, with significant omissions particularly evident in Figure 7(c(1),c(2),c(4)). As shown in Figure 7(d(1),d(2)), Swin-UNet displays a comparatively weaker ability to extract local features, resulting in lower precision for small water bodies. SCUNet++ performs slightly better than Swin-UNet in extracting small water bodies but still falls short of the LGFUNet model in overall performance. The visual results highlight that the LGFUNet model significantly improves the extraction of small water bodies compared with the other models, although some omissions are observed at the blurred boundaries of small water bodies in Figure 7(f(3)). A quantitative evaluation of the prediction results on the basis of the images in Figure 7 is presented in Table 4. Compared with the other models, the LGFUNet model performs best in small water body extraction. Specifically, compared with the Swin-UNet model, the LGFUNet model achieves improvements of 0.12%, 2.19%, 1.82%, 2.30%, 3.16%, and 2.37% in OA, precision, recall, F1-score, IoU, and kappa, respectively.

3.4. Performance Comparison for Water Body Identification Within Shadowed Areas

Shadows are among the most significant challenges when extracting water bodies from SAR images. In SAR images, shadows exhibit backscattering characteristics similar to those of water bodies (both appear as dark areas), leading to confusion between the two during water extraction, particularly in mountainous regions. Figure 8 presents the water extraction results of the LGFUNet model compared with those of the other models in shadow areas. The U-Net model demonstrates a poor ability to distinguish between water bodies and shadows, as is particularly evident in Figure 8(c(3),c(4)). In Figure 8(e(1),e(4)), the SCUNet++ model fails to differentiate between water bodies and shadows, resulting in significant misclassification. The visualization results show that, compared with the other models, the LGFUNet model performs best in identifying water bodies in shadow areas, with a noticeable reduction in misclassifying shadows as water. However, some omissions of small water bodies in shadow areas are still observed in Figure 8(f(1)). A quantitative evaluation of the prediction results based on the images in Figure 8 is presented in Table 5, and the findings indicate that the LGFUNet model outperforms the other models across all indicators.

3.5. Performance Comparison for Identifying Complex Spatially Continuous Water Bodies

SAR acquires information about ground objects by capturing reflected radar wave signals, which results in similar backscattering characteristics between water bodies and surrounding low-reflectance objects in SAR images. This makes it challenging to distinguish spatially continuous water bodies with complex boundaries from transition areas that exhibit similar backscattering characteristics. Accurately extracting the detailed boundaries of spatially continuous water bodies that resemble surrounding ground features is a significant challenge. The extraction results of the LGFUNet model and other models for this type of water body are illustrated in Figure 9.
For the water body region in Figure 9(a(3)), which exhibits complex spatial continuity and features that resemble those of surrounding ground objects, both U-Net and Swin-UNet struggle to distinguish the water body from other ground objects, demonstrating poor performance. In Figure 9(a(1)), Swin-UNet and SCUNet++ exhibit inadequate performance in extracting water bodies from areas with similar surrounding ground features. Additionally, in Figure 9(d(2),d(4)), Swin-UNet performs poorly in extracting the detailed boundaries of spatially continuous water bodies. In contrast, the LGFUNet model demonstrates superior performance across Figure 9(f(1)–(4)). Notably, in Figure 9(f(3)), the LGFUNet model yields a significant improvement over the other models, although it still fails to fully extract small river branches with complex, spatially continuous water body boundaries. A quantitative evaluation of the prediction results based on the images in Figure 9 is presented in Table 6. Compared with the Swin-UNet model, the proposed LGFUNet model achieves improvements of 6.39%, 27.73%, 18.73%, 26.21%, and 21.72% in OA, recall, F1-score, IoU, and kappa, respectively.

4. Discussion

The LGFUNet model proposed in this study employs the Swin-Transformer as a feature extractor, incorporating several key innovations. First, a DECASPP module is constructed, which combines atrous spatial pyramid pooling, ECA, and depthwise separable convolution to enhance multiscale feature extraction and obtain valuable multiscale information. Second, a series of LGFF modules is introduced between the encoder and decoder. These modules integrate global features from the encoder, multiscale features extracted by DECASPP, and local features extracted by the LGFF modules, which are then passed to the decoder. This integration reduces the semantic gap between the encoder and decoder feature maps and mitigates the spatial information loss caused by patch merging and downsampling, thereby improving the model’s ability to learn detailed information. As demonstrated by the visualization results and quantitative evaluations in Section 3, the LGFUNet model exhibits superior performance in extracting water bodies in complex regions, including small water bodies, spatially continuous water bodies, and shadow areas. The number of parameters for each model and the training time per epoch are shown in Table 7. The LGFUNet model has enhanced learning capabilities for water feature characteristics in SAR images, resulting in a slight increase in the number of parameters; however, it retains an advantage in training time. Moreover, the LGFUNet model demonstrates the best performance across various metrics on the test dataset.
To further validate the effectiveness of the proposed improvements in the LGFUNet model, ablation experiments were conducted to assess the impact of the DECASPP and LGFF modules. The results of the ablation experiments are presented in Table 8, and the visualization results are shown in Figure 10. In Swin-UNet+DECASPP, the DECASPP module alone is employed to enhance multiscale feature extraction, effectively improving the accuracy of water body extraction. However, the improvements in the other metrics are relatively modest. In Figure 10(d(1),d(2)), this module significantly enhances Swin-UNet’s ability to extract detailed water body boundaries. In Swin-UNet+LGFF, only the LGFF module is used to reduce the semantic gap between the encoder and decoder feature maps, improving Swin-UNet’s ability to extract local features. All the evaluation metrics for the prediction results show improvement. In Figure 10(e(1),e(3)), the LGFF module enhances Swin-UNet’s ability to extract small water bodies, although the overall model performance remains inferior to that of the LGFUNet model. The experiments demonstrate that the LGFF module can effectively enhance the model’s water extraction performance. While the DECASPP module alone does not significantly improve model performance, it provides the LGFF module with valuable multiscale information. When the DECASPP module is combined with the LGFF module, the model achieves optimal performance.
Siling Co (located in Nagqu City, Tibet Autonomous Region) is one of China’s largest inland saltwater lakes. Research by Lei et al. [47] has revealed that Siling Co has become one of the most rapidly expanding lakes on the Qinghai–Tibet Plateau over the past two decades. Its rapid expansion has led to the inundation of surrounding grasslands and road damage, significantly impacting the local environment, wildlife, and human livelihoods. Therefore, this study selected Siling Co to validate the model’s generalization capability and explored the correlation between lake area and meteorological data based on the monitoring results.
This research utilizes Sentinel-1A images acquired at the beginning of each month from July to November 2018 for the Siling Co region. Combined with the LGFUNet extraction results, we investigated the short-term fluctuations in the lake’s monthly water extent. The monthly variations in the water surface area of Siling Co Lake and the corresponding average precipitation from July to November 2018 are presented in Figure 11.
The Siling Co region experiences its rainy season from July to September, characterized by substantial monthly precipitation. During this period, the surface area of Siling Co Lake exhibits a steady expansion, reaching its maximum extent at the end of the rainy season (early October). Following the conclusion of the rainy season, the lake gradually enters its freezing period. During freezing, precipitation and evaporation on the Qinghai–Tibet Plateau are both minimal, resulting in relatively stable Siling Co lake areas. Consequently, this study employs July Sentinel-1A data from 2017 to 2024 to monitor Siling Co’s interannual variations.
The interannual variations in the Siling Co Lake area and annual precipitation from 2017 to 2024 are presented in Figure 12. The lake area exhibited a minimum value of 2347.7 km2 in 2017 and reached its maximum of 2441.82 km2 in 2024, demonstrating a consistent expansion trend. Over the eight-year period, the lake area increased by 94.12 km2 in total, with an average annual growth rate of 0.5%. Precipitation showed an overall increasing trend with considerable interannual variability (coefficient of variation, CV = 29.92%), indicating significant year-to-year differences in rainfall.
To quantitatively assess the relationship between lake area and climatic factors, we performed Pearson correlation analysis. The results revealed a statistically significant positive correlation (r = 0.76, p = 0.029) between lake area and precipitation, suggesting a strong association (p < 0.05). For instance, the annual precipitation in 2021 (from July 2020 to July 2021) reached 1135 mm, corresponding to an area increase of 17.63 km2, while the lower precipitation in 2018 (490 mm) resulted in a smaller expansion of 6.93 km2. Precipitation during the previous year’s rainy season (July–September) is stored through surface runoff and winter ice formation, then gradually released during spring snowmelt, ultimately influencing the lake area measured in July.
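As a consistency check (not a reproduction of the authors’ computation), the reported p-value follows from r = 0.76 with n = 8 annual observations via the standard t-test for a Pearson correlation:

```python
import math
from scipy.stats import t as t_dist

n, r = 8, 0.76
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)  # t = r*sqrt(n-2)/sqrt(1-r^2)
p_value = 2 * t_dist.sf(t_stat, df=n - 2)              # two-tailed p-value
print(round(p_value, 3))  # 0.029, matching the reported value
```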

5. Conclusions

In this study, a water extraction method based on SAR imagery, the LGFUNet model, is proposed. Sentinel-1A SAR images are used as the basis for extraction, and the model achieves promising extraction results in the study area on the Qinghai–Tibet Plateau. In the LGFUNet model, the Swin-Transformer module is employed to replace convolutional kernels for feature extraction, enhancing the learning of global features and improving the model’s ability to capture the spatial relationships associated with large, continuous water bodies. Within the DECASPP module, ECA and atrous spatial pyramid pooling are utilized to filter and refine multiscale features, thereby improving multiscale extraction performance. Additionally, a series of LGFF modules are introduced between the encoder and decoder. These modules integrate global information from the encoder, multiscale feature information from the DECASPP module, and local information extracted by the LGFF modules, which are then passed to the decoder. This integration reduces the semantic gap between the encoder and decoder feature maps, compensates for Swin-Transformer’s limitations in local feature extraction, mitigates spatial information loss during downsampling, and enhances the model’s ability to extract small water bodies. Both quantitative evaluation results and visualization results comparing the LGFUNet model with other models demonstrate that the LGFUNet model can accurately and effectively extract surface water resources on the Qinghai–Tibet Plateau, showing significant potential for water body extraction applications in this region.
In comparative experiments involving three challenging tasks—water extraction in shadowed areas, small lake extraction, and complex boundary extraction for continuous water bodies—the proposed LGFUNet model outperforms other end-to-end models, demonstrating superior performance. However, some limitations remain, such as occasional omissions of small water bodies with blurred boundaries and fine river branches with spatially continuous and complex boundaries. In future work, we aim to further refine the LGFUNet model, including but not limited to the following aspects: exploring more combinations of the Swin-Transformer and CNNs based on the LGFUNet architecture to optimize the model structure and enhance performance; expanding the dataset to improve the model’s generalization capabilities; and addressing the current limitations to achieve more robust and accurate water body extraction.

Author Contributions

Conceptualization, X.B. and Y.Z.; methodology, X.B., Y.Z. and J.W.; software, X.B.; validation, X.B., Y.Z. and J.W.; investigation, X.B. and J.W.; resources, Y.Z.; writing—original draft preparation, X.B.; writing—review and editing, X.B., Y.Z. and J.W.; visualization, X.B.; supervision, X.B. and Y.Z.; project administration, Y.Z. and J.W.; funding acquisition, Y.Z. and J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key R&D Program of China (grant 2023YFC3007202), the Open Fund of the Key Laboratory of Natural Resources Monitoring and Supervision in Southern Hilly Region, Ministry of Natural Resources (NRMSSHR2022Z09), and the Fundamental Research Funds from the Chinese Academy of Surveying and Mapping (AR2408).

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yang, M.; Wu, G.; Zhu, S.; Li, S. Study on reciprocal relationship among water amount-water quality-water efficiency based on the SWAT_WAQER model—A case study of the Yulin catchment. In Proceedings of the 2021 7th International Conference on Hydraulic and Civil Engineering & Smart Water Conservancy and Intelligent Disaster Reduction Forum (ICHCE & SWIDR), Nanjing, China, 6–8 November 2021; pp. 247–254. [Google Scholar] [CrossRef]
  2. Wu, D.; Wang, H.; Mohammed, H.; Seidu, R. Quality Risk Analysis for Sustainable Smart Water Supply Using Data Perception. IEEE Trans. Sustain. Comput. 2019, 5, 377–388. [Google Scholar] [CrossRef]
  3. Chen, X.; Liu, L.; Zhang, X.; Li, J.; Wang, S.; Liu, D.; Duan, H.; Song, K. An Assessment of Water Color for Inland Water in China Using a Landsat 8-Derived Forel–Ule Index and the Google Earth Engine Platform. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 5773–5785. [Google Scholar] [CrossRef]
  4. Jiang, W.; An, Z.; Jin, T.; Chen, P.; Zou, X. KF-MFWL: A High-Resolution Time Series Construction Algorithm for Lake Water Levels Based on Multisource Altimeter Satellites and Meteorological Data Fusion. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–12. [Google Scholar] [CrossRef]
  5. Wen, Q.; Li, L.; Xiong, L.; Du, L.; Liu, Q.; Wen, Q. A review of water body extraction research based on deep learning in remote sensing images. Remote Sens. Nat. Resour. 2024, 36, 57–71. [Google Scholar] [CrossRef]
  6. Gao, H.X.; Chen, B.; Sun, H.Q. Research progress and prospect of flood detection based on SAR satellite images. J. Geo-Inf. Sci. 2023, 25, 1933–1953. [Google Scholar] [CrossRef]
  7. Kavats, O.; Khramov, D.; Sergieieva, K.; Puputti, J.; Joutsenvaara, J.; Kotavaara, O. Optimal Threshold Selection for Water Bodies Mapping from Sentinel-1 Images Based on Sentinel-2 Water Masks. In Proceedings of the IGARSS 2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; pp. 5551–5554. [Google Scholar] [CrossRef]
  8. Rahimi, Z.; Othman, F. Water extend estimation over vegetated terrains in coastal area using multitemporal remote sensing data: A case study in Tumpat, Malaysia. In Proceedings of the 7th Brunei International Conference on Engineering and Technology 2018 (BICET 2018), Bandar Seri Begawan, Brunei, 12–14 November 2018; pp. 1–4. [Google Scholar] [CrossRef]
  9. Wang, Z.; Zhang, R.; Zhang, Q.; Zhu, Y.; Huang, B.; Lu, Z. An Automatic Thresholding Method for Water Body Detection from SAR Image. In Proceedings of the 2019 IEEE International Conference on Signal, Information and Data Processing (ICSIDP), Chongqing, China, 11–13 December 2019; pp. 1–4. [Google Scholar] [CrossRef]
  10. Meng, L.; Mao, X.; Wei, Z.; Zhang, W. Probabilistic water body mapping of GF-3 images based on prior probability estimation. Acta Geod. Cartogr. Sin. 2019, 48, 439–447. [Google Scholar] [CrossRef]
  11. Tang, L.; Liu, W.; Yang, D.; Chen, L.; Su, Y.; Xu, X. Flooding monitoring application based on the object-oriented method and Sentinel-1A SAR data. J. Geo-Inf. Sci. 2018, 20, 377–384. [Google Scholar] [CrossRef]
  12. Li, N.; Niu, S.; Guo, Z.; Wu, L.; Zhao, J.; Min, L.; Ge, D.; Chen, J. Dynamic Waterline Mapping of Inland Great Lakes Using Time-Series SAR Data from GF-3 and S-1A Satellites: A Case Study of DJK Reservoir, China. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 2019, 12, 4297–4314. [Google Scholar] [CrossRef]
  13. Li, C.; Xue, D.; Zhang, L.; Su, L. Research on Water Extraction Method Based on Sentinel-1A Satellite SAR Data. Geospat. Inf. 2018, 16, 37–40. [Google Scholar] [CrossRef]
  14. Liu, Z.; Li, Y.; Bi, Y.; Song, S.; Niu, Y.; Zhao, J. A Novel Flood Monitoring Method Using Temporal Information and Statistical Characteristics in SAR Images. In Proceedings of the 2024 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2024), Athens, Greece, 7–12 July 2024. [Google Scholar]
  15. Tang, D.; Wang, F.; Wang, H. Single-polarization SAR data flood water detection method based on Markov segmentation. J. Electron. Inf. Technol. 2019, 41, 619–625. [Google Scholar] [CrossRef]
  16. Wang, L.; Lian, Z. Remote sensing monitoring of Poyang lake flood disaster in 2020 based on Sentinel-1A. Geospat. Inf. 2022, 20, 43–46. [Google Scholar] [CrossRef]
  17. Zhou, Y.; Dong, J. Review on monitoring open surface water body using remote sensing. J. Geo-Inf. Sci. 2019, 21, 1768–1778. [Google Scholar] [CrossRef]
  18. Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. Their Appl. 1998, 13, 18–28. [Google Scholar] [CrossRef]
  19. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  20. Chen, Y.; Tang, L.; Kan, Z.; Bilal, M.; Li, Q. A Novel Water Body Extraction Neural Network (WBE-NN) for Optical High-Resolution Multispectral Imagery. J. Hydrol. 2020, 588, 125092. [Google Scholar] [CrossRef]
  21. Li, N.; Guo, Z.; Zhao, J.; Wu, L.; Guo, Z. Characterizing Ancient Channel of the Yellow River from Spaceborne SAR: Case Study of Chinese Gaofen-3 Satellite. IEEE Geosci. Remote. Sens. Lett. 2021, 19, 1–5. [Google Scholar] [CrossRef]
  22. Guo, Z.; Zhao, J.; Li, N.; Wu, L. An Adaptive Irregular Convolution U-Net for Reconstructing Ancient Channel of the Yellow River. In Proceedings of the 2021 IEEE Sensors, Sydney, Australia, 31 October–3 November 2021; pp. 1–4. [Google Scholar] [CrossRef]
  23. Ai, J.; Mao, Y.; Luo, Q.; Jia, L.; Xing, M. SAR Target Classification Using the Multikernel-Size Feature Fusion-Based Convolutional Neural Network. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–13. [Google Scholar] [CrossRef]
  24. Wang, K.; Ren, Z.; Hou, B.; Sha, F.; Wang, Z.; Li, W.; Jiao, L. Water-Matching CAM: A Novel Class Activation Map for Weakly-Supervised Semantic Segmentation of Water in SAR Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 18, 3222–3235. [Google Scholar] [CrossRef]
  25. Xie, Y.; Zeng, H.; Yang, K.; Yuan, Q.; Yang, C. Water-Body Detection in Sentinel-1 SAR Images with DK-CO Network. Electronics 2023, 12, 3163. [Google Scholar] [CrossRef]
  26. Duan, Y.; Sun, K.; Li, W.; Wei, J.; Gao, S.; Tan, Y.; Zhou, W.; Liu, J.; Liu, J. WCMU-net: An Effective Method for Reducing the Impact of Speckle Noise in SAR Image Change Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 18, 2880–2892. [Google Scholar] [CrossRef]
  27. Yang, Y.; Zhu, W.; Li, J. A Review of SAR Image Target Recognition Based on Deep Learning. Electron. Opt. Control 2022, 29, 58–62. Available online: https://link.cnki.net/urlid/41.1227.TN.20211105.1909.004 (accessed on 8 November 2021).
  28. Pillai, L.G.; Dolly, D.R.J. Flood detection using SAR images: A review. AIP Conf. Proc. 2024, 3059, 020001. [Google Scholar] [CrossRef]
  29. Dai, L.-Y.; Li, M.-D.; Chen, S.-W. PCCN: Polarimetric Contexture Convolutional Network for PolSAR Image Super-Resolution. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 4664–4679. [Google Scholar] [CrossRef]
  30. Hu, B.; Miao, H. A Lightweight SAR Ship Detection Network Based on Deep Multiscale Grouped Convolution, Network Pruning, and Knowledge Distillation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 18, 2190–2207. [Google Scholar] [CrossRef]
  31. Feng, Y.; Zhang, Y.; Zhang, X.; Wang, Y.; Mei, S. Large Convolution Kernel Network with Edge Self-Attention for Oriented SAR Ship Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 18, 2867–2879. [Google Scholar] [CrossRef]
  32. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar] [CrossRef]
  33. Kang, W.; Xiang, Y.; Wang, F.; Wan, L.; You, H. Flood Detection in Gaofen-3 SAR Images via Fully Convolutional Networks. Sensors 2018, 18, 2915. [Google Scholar] [CrossRef]
  34. Deng, H.; Xu, T.; Zhou, Y.; Miao, T. Depth Density Achieves a Better Result for Semantic Segmentation with the Kinect System. Sensors 2020, 20, 812. [Google Scholar] [CrossRef] [PubMed]
  35. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar] [CrossRef]
  36. Wang, J.; Wang, S. Flood Inundation Region Extraction Method Based on Sentinel-1SAR Data. J. Catastrophology 2021, 36, 214–220. [Google Scholar] [CrossRef]
  37. Song, H.; Wu, H.; Huang, J.; Zhong, H.; He, M.; Su, M.; Yu, G.; Wang, M.; Zhang, J. HA-UNet: A Modified UNet Based on Hybrid Attention for Urban Water Extraction in SAR Images. Electronics 2022, 11, 3787. [Google Scholar] [CrossRef]
  38. Xu, C.; Zhang, S.; Zhao, B.; Liu, C.; Sui, H.; Yang, W.; Mei, L. SAR image water extraction using the attention U-net and multi-scale level set method: Flood monitoring in South China in 2020 as a test case. Geo-Spat. Inf. Sci. 2021, 25, 155–168. [Google Scholar] [CrossRef]
  39. Roy, A.G.; Navab, N.; Wachinger, C. Concurrent spatial and channel ‘squeeze & excitation’ in fully convolutional networks. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2018: 21st International Conference, Granada, Spain, 16–20 September 2018; pp. 421–429, Part I. [Google Scholar] [CrossRef]
  40. Wang, J.; Wang, S.; Wang, F.; Zhou, Y.; Wang, Z.; Ji, J.; Xiong, Y.; Zhao, Q. FWENet: A deep convolutional neural network for flood water body extraction based on SAR images. Int. J. Digit. Earth 2022, 15, 345–361. [Google Scholar] [CrossRef]
  41. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 9992–10002. [Google Scholar]
  42. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 11531–11539. [Google Scholar] [CrossRef]
  43. Sifre, L.; Mallat, S. Rigid-motion scattering for texture classification. arXiv 2014, arXiv:1403.1687. [Google Scholar]
  44. Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European conference on computer vision (ECCV), Munich, Germany, 8–14 September 2018. [Google Scholar] [CrossRef]
  45. Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-Unet: Unet-like pure transformer for medical image segmentation. In European Conference on Computer Vision; Springer Nature: Cham, Switzerland, 2022; pp. 205–218. [Google Scholar] [CrossRef]
  46. Chen, Y.; Zou, B.; Guo, Z.; Huang, Y.; Huang, Y.; Qin, F.; Li, Q.; Wang, C. SCUNet++: Swin-UNet and CNN Bottleneck Hybrid Architecture with Multi-Fusion Dense Skip Connection for Pulmonary Embolism CT Image Segmentation. In Proceedings of the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2024; pp. 7744–7752. [Google Scholar] [CrossRef]
  47. Lei, Y.; Zhou, J.; Yao, T.; Bird, B.W.; Yu, Y.; Wang, S.; Yang, K.; Zhang, Y.; Zhai, J.; Dai, Y. Overflow of Siling Co on the central Tibetan Plateau and its environmental impacts. Sci. Bull. 2024, 69, 2829–2832. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Overall structure of the proposed LGFUNet.
Figure 2. Swin-Transformer block.
Figure 3. DECASPP structure.
Figure 4. The LGFF structure.
Figure 5. Examples from the SAR sample dataset.
Figure 6. Examples from the test dataset.
Figure 7. Performance comparison of different models for small lake water extraction. (a(1)–a(4)) represent the SAR images; (b(1)–b(4)) depict the corresponding ground truth features for the SAR images; (c(1)–c(4)), (d(1)–d(4)), (e(1)–e(4)), and (f(1)–f(4)) display the extraction results of U-Net, Swin-UNet, SCUNet++, and LGFUNet, respectively. In the extraction results, the black and white regions denote non-water and water bodies, respectively. The red circles highlight key areas where prediction errors occur.
Figure 8. Performance comparison of different models for water extraction in shadow areas. (a(1)–a(4)) represent the SAR images; (b(1)–b(4)) depict the corresponding ground truth features for the SAR images; (c(1)–c(4)), (d(1)–d(4)), (e(1)–e(4)), and (f(1)–f(4)) display the extraction results of U-Net, Swin-UNet, SCUNet++, and LGFUNet, respectively. In the extraction results, the black and white regions denote non-water and water bodies, respectively. The red circles highlight key areas where prediction errors occur.
Figure 9. Performance comparison of different models in extracting the detailed boundaries of spatially continuous water bodies. (a(1)–a(4)) represent the SAR images; (b(1)–b(4)) depict the corresponding ground truth features for the SAR images; (c(1)–c(4)), (d(1)–d(4)), (e(1)–e(4)), and (f(1)–f(4)) display the extraction results of U-Net, Swin-UNet, SCUNet++, and LGFUNet, respectively. In the extraction results, the black and white regions denote non-water and water bodies, respectively. The red circles highlight key areas where prediction errors occur.
Figure 10. Performance comparison of different models in the ablation experiment. (a(1)–a(3)) represent the SAR images; (b(1)–b(3)) depict the corresponding ground truth features for the SAR images; (c(1)–c(3)), (d(1)–d(3)), (e(1)–e(3)), and (f(1)–f(3)) display the extraction results of Swin-UNet, Swin-UNet+DECASPP, Swin-UNet+LGFF, and LGFUNet, respectively. In the extraction results, the black and white regions denote non-water and water bodies, respectively. The red circles highlight key areas where prediction errors occur.
Figure 11. The monthly variations in Siling Co’s surface area and average precipitation from July to November 2018.
Figure 12. Line chart of the area of Siling Co Lake and precipitation from 2017 to 2024.
Table 1. Basic information for the data sources.
Parameter               Sentinel-1A
Product format          SLC
Product level           Level-1
Radar wave frequency    5.4 GHz
Beam mode               Interferometric wide swath
Polarization            VV
Resolution              13.9 m × 2.3 m
Image size              20,148 × 15,774
Table 2. The confusion matrix.
                              Prediction
                              Water                  Background
Ground Truth    Water         True Positive (TP)     False Negative (FN)
                Background    False Positive (FP)    True Negative (TN)
Table 3. Metrics of the models for the test dataset.
Metric           U-Net    Swin-UNet    SCUNet++    LGFUNet
OA (%)           96.87    98.39        98.98       99.31
Precision (%)    93.27    95.24        95.60       96.31
Recall (%)       86.48    90.89        90.90       92.50
F1-score (%)     87.43    92.52        92.42       94.04
IoU (%)          80.95    87.03        —           89.48
Kappa (%)        85.32    91.62        —           93.40
Table 4. Model metrics for the images in Figure 7.
Metric           U-Net    Swin-UNet    SCUNet++    LGFUNet
OA (%)           99.03    99.13        99.21       99.25
Precision (%)    97.55    92.06        94.97       94.25
Recall (%)       69.19    77.46        76.12       79.28
F1-score (%)     79.51    83.32        83.10       85.62
IoU (%)          67.75    72.41        72.73       75.57
Kappa (%)        79.02    82.87        82.70       85.24
Table 5. Model metrics for the images in Figure 8.
Metric           U-Net    Swin-UNet    SCUNet++    LGFUNet
OA (%)           98.52    99.38        98.92       99.62
Precision (%)    93.59    98.60        92.32       99.17
Recall (%)       91.11    93.33        93.07       94.06
F1-score (%)     92.10    95.83        92.30       96.51
IoU (%)          85.38    92.12        86.38       93.36
Kappa (%)        88.55    94.61        89.53       96.31
Table 6. Model metrics for the images in Figure 9.
Metric           U-Net    Swin-UNet    SCUNet++    LGFUNet
OA (%)           91.80    91.92        96.89       98.31
Precision (%)    98.43    97.42        97.98       97.47
Recall (%)       75.35    67.32        79.84       95.05
F1-score (%)     81.19    77.47        86.74       96.20
IoU (%)          74.32    66.53        78.86       92.74
Kappa (%)        77.83    73.26        84.66       94.98
Table 7. Comparison of the parameter amounts and training times of various models.
Models       Number of Trainable Parameters    Training Time per Epoch (s)
U-Net        3.1 × 10^7                        667
Swin-UNet    4.2 × 10^7                        403
SCUNet++     6.3 × 10^7                        629
LGFUNet      7.2 × 10^7                        460
Table 8. Metrics for the ablation experiment.
Metric           Swin-UNet    Swin-UNet+DECASPP    Swin-UNet+LGFF    LGFUNet
OA (%)           98.39        99.16                98.97             99.31
Precision (%)    95.24        94.98                95.17             96.31
Recall (%)       90.89        91.94                91.31             92.50
F1-score (%)     92.52        93.06                92.78             94.04
IoU (%)          87.03        87.75                87.35             89.48
Kappa (%)        91.62        92.32                91.92             93.40