A Novel U-Shaped Network Combined with a Hierarchical Sparse Attention Mechanism for Coastal Aquaculture Area Extraction in a Complex Environment
Highlights
- The proposed HSAUNet model achieves state-of-the-art accuracy (93.44 percent IoU) in extracting coastal aquaculture areas from satellite imagery, demonstrating superior performance, especially in complex environments near saltpans.
- The model’s novel components—the Dycross Sample Module for precise boundary delineation and the Sparse Attention Module for capturing global context—are key to its success in distinguishing spectrally similar features.
- This work provides a highly reliable, automated tool for monitoring and managing coastal resources, offering critical information for the sustainable development of both the aquaculture and salt industries.
- The architectural innovations present a valuable technical reference for the remote sensing community, advancing the capability of deep learning models in semantic segmentation tasks for complex geographical environments.
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Areas
2.2. Data and Preprocessing
2.3. Model Architecture Design
2.3.1. Sparse Attention Module
2.3.2. Dycross Sample Module
2.4. Training Details
3. Results
3.1. Evaluation Criteria
3.2. Experiment Details
3.3. Comparisons of Other Models
3.3.1. Comparative Methods
3.3.2. Experimental Results
3.4. Ablation Study
4. Discussion
4.1. The Impact of the Local and Limited Perspective
4.2. Advantages, Limitations, and Potential Improvements
4.2.1. Advantages
4.2.2. Limitations
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References

| Study Area | Satellite | Spatial Resolution | Bands | Image Date | Image Size |
|---|---|---|---|---|---|
| Yinggehai Salt Field | Sentinel-2 | 10 m | 2, 3, 4, 8 | 2 March 2025 | 10,980 × 10,980 |
| Changlu Hangu Salt Field | Sentinel-2 | 10 m | 2, 3, 4, 8 | 17 July 2024 | 10,980 × 10,980 |
| Shanwei Changsha Bay | Sentinel-2 | 10 m | 2, 3, 4, 8 | 12 October 2024 | 10,980 × 10,980 |
| Haibei Salt Field | Sentinel-2 | 10 m | 2, 3, 4, 8 | 6 November 2024 | 10,980 × 10,980 |
| Dalian Biliuhe Bay | Sentinel-2 | 10 m | 2, 3, 4, 8 | 26 June 2024 | 10,980 × 10,980 |
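
The figures in the table above imply a ground footprint for each scene and, once a patch size is chosen, a patch count per scene. The sketch below works through that arithmetic. The 512-pixel patch size and stride are assumptions made purely for illustration; the table does not state how the scenes were cropped for training.

```python
import math

# Values taken from the data table above.
IMAGE_SIDE_PX = 10_980   # width and height of each Sentinel-2 scene, in pixels
PIXEL_SIZE_M = 10        # spatial resolution of the selected bands, in metres
BANDS = (2, 3, 4, 8)     # Sentinel-2 blue, green, red, near-infrared

# Hypothetical patch settings (NOT from the source).
PATCH_PX = 512
STRIDE_PX = 512


def scene_side_km(side_px: int = IMAGE_SIDE_PX, pixel_m: int = PIXEL_SIZE_M) -> float:
    """Ground length of one scene side in kilometres."""
    return side_px * pixel_m / 1000.0


def patches_per_side(side_px: int, patch_px: int, stride_px: int) -> int:
    """Patch positions along one axis, assuming the last patch is padded."""
    if side_px <= patch_px:
        return 1
    return math.ceil((side_px - patch_px) / stride_px) + 1


if __name__ == "__main__":
    per_side = patches_per_side(IMAGE_SIDE_PX, PATCH_PX, STRIDE_PX)
    print(f"scene side: {scene_side_km()} km")
    print(f"patches per side: {per_side}, per scene: {per_side ** 2}")
```

Under these assumed settings each scene spans 109.8 km per side and yields 22 patches per side, or 484 patches per scene, each with four channels.
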

| Model | Pre (%) | Rec (%) | F1 (%) | IoU (%) | OA (%) | Kappa (%) |
|---|---|---|---|---|---|---|
| Swin-ViT | 96.12 ± 0.02 | 96.24 ± 0.03 | 96.21 ± 0.03 | 92.75 ± 0.03 | 96.69 ± 0.03 | 94.29 ± 0.02 |
| PSPNet | 95.84 ± 0.02 | 96.25 ± 0.02 | 96.06 ± 0.03 | 92.43 ± 0.03 | 96.64 ± 0.02 | 94.28 ± 0.03 |
| DeepLabv3+ | 96.42 ± 0.03 | 96.31 ± 0.02 | 96.35 ± 0.02 | 92.97 ± 0.02 | 96.84 ± 0.02 | 94.68 ± 0.03 |
| SegFormer | 96.15 ± 0.01 | 96.32 ± 0.02 | 96.22 ± 0.03 | 92.87 ± 0.02 | 96.72 ± 0.02 | 94.56 ± 0.01 |
| HSAUNet | 96.66 ± 0.02 | 96.51 ± 0.02 | 96.58 ± 0.03 | 93.42 ± 0.01 | 97.07 ± 0.03 | 95.02 ± 0.02 |
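
The six column metrics in the table above can all be derived from a pixel-level confusion matrix. The sketch below uses the standard textbook definitions for a binary task, treating aquaculture as the positive class. That binary framing, the function name, and the example counts are assumptions; the table does not say whether the reported values are per-class or averaged.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard segmentation metrics from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)
    oa = (tp + tn) / total  # overall accuracy
    # Chance agreement: product of predicted and actual marginals per class.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (oa - pe) / (1 - pe)
    return {"Pre": precision, "Rec": recall, "F1": f1,
            "IoU": iou, "OA": oa, "Kappa": kappa}


if __name__ == "__main__":
    # Made-up pixel counts, for illustration only.
    metrics = binary_metrics(tp=90, fp=5, fn=10, tn=95)
    for name, value in metrics.items():
        print(f"{name}: {100 * value:.2f}")
```

For a single binary class, IoU and F1 are tied by IoU = F1 / (2 − F1). Applying this to the HSAUNet F1 of 96.58 gives roughly 93.39, close to the reported IoU of 93.42. A small gap of this kind would be expected if the reported values are averaged over runs or classes.
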

| Model | Pre (%) | Rec (%) | F1 (%) | IoU (%) | OA (%) | Kappa (%) |
|---|---|---|---|---|---|---|
| HSAUNet-without Dycross Sample Module | 96.32 | 96.21 | 96.39 | 93.06 | 96.81 | 94.52 |
| HSAUNet-without Sparse Attention Module | 95.63 | 95.63 | 95.72 | 92.47 | 95.95 | 93.77 |
| HSAUNet | 96.66 | 96.51 | 96.58 | 93.42 | 97.07 | 95.02 |
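
To read the ablation rows as per-module contributions, it helps to subtract each reduced variant from the full model. The sketch below does that with the values copied from the table above. The dictionary layout and function name are illustrative choices, not part of the source.

```python
# Values copied from the ablation table (percent).
ABLATION = {
    "full": {"Pre": 96.66, "Rec": 96.51, "F1": 96.58,
             "IoU": 93.42, "OA": 97.07, "Kappa": 95.02},
    "without Dycross Sample Module": {"Pre": 96.32, "Rec": 96.21, "F1": 96.39,
                                      "IoU": 93.06, "OA": 96.81, "Kappa": 94.52},
    "without Sparse Attention Module": {"Pre": 95.63, "Rec": 95.63, "F1": 95.72,
                                        "IoU": 92.47, "OA": 95.95, "Kappa": 93.77},
}


def drop_vs_full(variant: str) -> dict:
    """Percentage-point drop of each metric when a module is removed."""
    full = ABLATION["full"]
    reduced = ABLATION[variant]
    return {name: round(full[name] - reduced[name], 2) for name in full}


if __name__ == "__main__":
    for variant in ABLATION:
        if variant != "full":
            print(variant, drop_vs_full(variant))
```

On these numbers, removing the Sparse Attention Module costs 0.95 points of IoU and 1.25 points of Kappa, while removing the Dycross Sample Module costs 0.36 and 0.50 points respectively.
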

| Model | HSAUNet | DeepLabv3-ResNet50 | PSPNet-ResNet50 | UperNet-Swin Transformer (base-sized) | SegFormer-B5 |
|---|---|---|---|---|---|
| FLOPs(T) | 0.12 | 0.177 | 0.179 | 0.299 | 0.19 |
| Params(M) | 25.683 | 41.246 | 46.612 | 122.8 | 61.408 |
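
The complexity figures above are easier to compare as relative savings. The sketch below computes how many fewer FLOPs and parameters HSAUNet uses than each baseline, with the values copied from the table. The dictionary keys are shortened model names and the helper is hypothetical.

```python
# (FLOPs in T, parameters in M), copied from the complexity table.
COMPLEXITY = {
    "HSAUNet": (0.12, 25.683),
    "Deeplabv3-ResNet50": (0.177, 41.246),
    "PSPNet-ResNet50": (0.179, 46.612),
    "UperNet-Swin-Base": (0.299, 122.8),
    "Segformer-b5": (0.19, 61.408),
}


def savings_vs(baseline: str) -> tuple:
    """Percent fewer FLOPs and parameters for HSAUNet relative to a baseline."""
    ref_flops, ref_params = COMPLEXITY["HSAUNet"]
    flops, params = COMPLEXITY[baseline]
    return (100 * (1 - ref_flops / flops), 100 * (1 - ref_params / params))


if __name__ == "__main__":
    for name in COMPLEXITY:
        if name != "HSAUNet":
            flops_saved, params_saved = savings_vs(name)
            print(f"{name}: {flops_saved:.1f}% fewer FLOPs, "
                  f"{params_saved:.1f}% fewer parameters")
```

Against every listed baseline both savings come out positive, ranging from about a third fewer FLOPs (DeepLabv3 and PSPNet) to roughly 79% fewer parameters (the Swin-based UperNet).
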
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, C.; Zhao, Y.; Li, L.; Liu, T. A Novel U-Shaped Network Combined with a Hierarchical Sparse Attention Mechanism for Coastal Aquaculture Area Extraction in a Complex Environment. Remote Sens. 2025, 17, 3897. https://doi.org/10.3390/rs17233897

