Semantic Segmentation of Clouds and Cloud Shadows Using State Space Models
Abstract
1. Introduction
2. Network Architecture
2.1. Backbone Architecture
2.2. VSSM Module
2.3. Mamba–Convolution Fusion Module
3. Experimental Results
3.1. Datasets
- (a) CloudSEN-12: A large-scale dataset for cloud semantic understanding built from Sentinel-2 multispectral imagery, with annotations of clouds and cloud shadows distributed across all continents except Antarctica [42]. Its diverse band information and high-resolution imagery make it a valuable resource for studying cloud and cloud shadow detection in complex scenes. The dataset comprises 9880 regions of interest (ROIs); each ROI contains five 5090 m × 5090 m image patches (IPs) acquired on different dates, and the images were manually selected so that every IP within an ROI matches one of the cloud-cover groups.
- (b) 38-Cloud: This dataset targets cloud detection in Landsat 8 satellite images and contains 38 scenes with pixel-level annotations [43]. Its multispectral band configuration helps distinguish clouds from other highly reflective surface objects such as ice, snow, and buildings, creating a more challenging environment for model training. The dataset is binary, with two classes: cloud and background. Labeling was performed manually by professionals, ensuring high-quality, accurate annotations.
- (c) SPARCS-Val: This dataset was created by Oregon State University in the United States to validate the performance of cloud and cloud shadow removal algorithms [44]. It covers a variety of feature types and complex scene combinations; each scene is accompanied by finely labeled, manually annotated images in which seven categories, including cloud shadow, cloud shadow over water, ice/snow, and cloud, are annotated in detail. This diversity enriches the scenes available for model validation and gives researchers a rich data foundation.
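The CloudSEN-12 description above mentions that IPs are selected so each matches one of the cloud-cover groups. As a minimal sketch of that bucketing step, the following assigns a coarse cloud-cover group from an annotation mask; the class ids and thresholds here are illustrative assumptions, not the dataset's actual conventions:

```python
import numpy as np

# Hypothetical label ids for illustration; the real CloudSEN-12 ids may differ.
CLEAR, THICK_CLOUD, THIN_CLOUD, CLOUD_SHADOW = 0, 1, 2, 3

def cloud_cover_group(mask: np.ndarray) -> str:
    """Bucket an annotation mask into a coarse cloud-cover group by the
    fraction of cloudy pixels (the cutoffs below are illustrative only)."""
    cloudy = np.isin(mask, (THICK_CLOUD, THIN_CLOUD)).mean()
    if cloudy == 0.0:
        return "clear"
    if cloudy < 0.25:
        return "low"
    if cloudy < 0.65:
        return "mid"
    return "high"

mask = np.zeros((512, 512), dtype=np.uint8)
mask[:128, :] = THICK_CLOUD          # 25% of the pixels are cloudy
print(cloud_cover_group(mask))       # -> "mid"
```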
3.2. Experimental Setup
3.3. Ablation Experiments
- (a) Convolutional Branch: The baseline model uses only the convolutional branch, achieving an MIoU of 73.22% together with a comparatively low MPA (Mean Pixel Accuracy). While the convolutional branch effectively extracts local features, it struggles to capture long-range dependencies; this limitation shows up not only in the moderate MIoU but also in the low MPA, especially for small-scale cloud regions. The low MPA indicates that the baseline frequently misclassifies these small cloud regions as non-cloud areas, since it cannot integrate global contextual information to distinguish them from ground objects with similar textures.
- (b) Convolutional Branch + VSSM Branch: After the VSSM branch is added to the baseline, the MIoU rises to 76.30%, an improvement of 3.08%, and the MPA also increases. The VSSM branch captures long-range dependencies through the state space model, significantly enhancing the model's ability to perceive global information. The larger improvement in MPA confirms that the VSSM branch addresses the baseline's weakness in classifying small or scattered cloud categories, to which the MPA metric is more sensitive.
- (c) Convolutional Branch + VSSM Branch + MC Module: With the further addition of the MC module, the MIoU rises to 78.19%, an improvement of 1.89%, and the MPA increases again. The MC module integrates the global features from the VSSM branch with the local features from the convolutional branch, enabling cross-scale feature interaction. This further enhances feature representation and strengthens the model's ability to interpret complex cloud and cloud shadow features.
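To make the fusion step concrete, here is a minimal numpy sketch of a gated residual fusion between a local (convolutional-branch) and a global (VSSM-branch) feature map. The projection/gate weights and the gating scheme are illustrative stand-ins, not the paper's actual MC module:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mc_fuse(local_feat, global_feat, w_proj, w_gate):
    """Gated residual fusion of local and global feature maps of shape
    (C, H, W). A 1x1 'convolution' is a matmul over the channel axis."""
    x = np.concatenate([local_feat, global_feat], axis=0)    # (2C, H, W)
    fused = np.einsum('oc,chw->ohw', w_proj, x)              # project back to C
    gate = sigmoid(np.einsum('oc,chw->ohw', w_gate, fused))  # per-pixel gate
    return fused * gate + local_feat                         # residual connection

C, H, W = 8, 4, 4
local = rng.standard_normal((C, H, W))    # convolutional-branch features
global_ = rng.standard_normal((C, H, W))  # VSSM-branch features
out = mc_fuse(local, global_,
              rng.standard_normal((C, 2 * C)) * 0.1,
              rng.standard_normal((C, C)) * 0.1)
print(out.shape)  # (8, 4, 4)
```

The residual back to `local_feat` keeps fine local detail intact even when the gate suppresses the fused global signal, which mirrors the stated goal of combining local texture with global context.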
3.4. Comparative Experiments
3.4.1. Generalization Experiments on the CloudSEN-12 Dataset
3.4.2. Generalization Experiments on the 38-Cloud Dataset
3.4.3. Generalization Experiments on the SPARCS-Val Dataset
4. Performance Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- King, M.D.; Platnick, S.; Menzel, W.P.; Ackerman, S.A.; Hubanks, P.A. Spatial and temporal distribution of clouds observed by MODIS onboard the Terra and Aqua satellites. IEEE Trans. Geosci. Remote Sens. 2013, 51, 3826–3852. [Google Scholar] [CrossRef]
- Papageorgiou, G.; Petrakis, C.; Ioannou, N.; Zagarelou, D. Effective business planning for sustainable urban development: The case of active mobility. In Proceedings of the ECIE 2019 14th European Conference on Innovation and Entrepreneurship (2 vols), Kalamata, Greece, 19–20 September 2019; p. 759. [Google Scholar]
- McNally, A.P.; Watts, P.D. A cloud detection algorithm for high-spectral-resolution infrared sounders. Q. J. R. Meteorol. Soc. 2003, 129, 3411–3423. [Google Scholar] [CrossRef]
- Tapakis, R.; Charalambides, A.G. Equipment and methodologies for cloud detection and classification: A review. Sol. Energy 2013, 95, 392–430. [Google Scholar] [CrossRef]
- Goodman, A.H.; Henderson-Sellers, A. Cloud detection and analysis: A review of recent progress. Atmos. Res. 1988, 21, 203–228. [Google Scholar] [CrossRef]
- Kazantzidis, A.; Tzoumanikas, P.; Bais, A.; Fotopoulos, S.; Economou, G. Cloud detection and classification with the use of whole-sky ground-based images. Atmos. Res. 2012, 113, 80–88. [Google Scholar] [CrossRef]
- Zi, Y.; Xie, F.; Jiang, Z. A cloud detection method for Landsat 8 images based on PCANet. Remote Sens. 2018, 10, 877. [Google Scholar] [CrossRef]
- Tian, M. A method for building a cadastral database of villages and towns based on ArcGIS. Beijing Surv. Mapp. 2015, 6, 94–98. [Google Scholar]
- Huang, Q.; Zheng, X.J.; Liu, C. Non-meteorological applications of meteorological satellite data in China. China Aerosp. 1997, 7, 14–17. [Google Scholar]
- Xiang, D.X. Research on Drought Remote Sensing Monitoring Model Based on Cloud Parameter Method. Master’s Thesis, Wuhan University, Wuhan, China, 2011. [Google Scholar]
- Xu, M.; Wang, S.H.; Guo, R.Z.; Jia, X.; Jia, S. Review of Cloud Detection and Removal Methods for Remote Sensing Images. J. Comput. Res. Dev. 2024, 61, 1585–1607. [Google Scholar]
- Mohajerani, S.; Krammer, T.A.; Saeedi, P. Cloud detection algorithm for remote sensing images using fully convolutional neural networks. arXiv 2018, arXiv:1810.05782. [Google Scholar] [CrossRef]
- Zhang, Q.; Xiao, C. Cloud detection of RGB color aerial photographs by progressive refinement scheme. IEEE Trans. Geosci. Remote Sens. 2014, 52, 7264–7275. [Google Scholar] [CrossRef]
- Wang, L.; Li, X.; Bao, Y.X.; Shao, Y. Research progress of remote sensing application on transportation meteorological disasters. Remote Sens. Land Resour. 2018, 30, 1–7. [Google Scholar]
- Chassery, J.M.; Garbay, C. An iterative segmentation method based on a contextual color and shape criterion. IEEE Trans. Pattern Anal. Mach. Intell. 1984, PAMI-6, 794–800. [Google Scholar] [CrossRef] [PubMed]
- Li, H.; Wang, Y.; Liu, K.J.R.; Lo, S.-C.B.; Freedman, M.T. Computerized radiographic mass detection. I. Lesion site selection by morphological enhancement and contextual segmentation. IEEE Trans. Med. Imaging 2001, 20, 289–301. [Google Scholar] [CrossRef] [PubMed]
- Wang, H.X.; Jin, H.J.; Wang, J.L.; Jiang, W.S. Optimization Approach for Multi-scale Segmentation of Remotely Sensed Imagery under k-means Clustering Guidance. Cehui Xuebao 2015, 44, 526. [Google Scholar]
- Huang, Z.K.; Chau, K.W. A new image thresholding method based on Gaussian mixture model. Appl. Math. Comput. 2008, 205, 899–907. [Google Scholar] [CrossRef]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Weng, L.; Pang, K.; Xia, M.; Lin, H.; Qian, M.; Zhu, C. Sgformer: A local and global features coupling network for semantic segmentation of land cover. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 6812–6824. [Google Scholar] [CrossRef]
- Dong, Z.; Yang, D.; Reindl, T.; Walsh, W.M. Short-term solar irradiance forecasting using exponential smoothing state space model. Energy 2013, 55, 1104–1113. [Google Scholar] [CrossRef]
- Haralick, R.M.; Shanmugam, K.; Dinstein, I.H. Textural features for image classification. IEEE Trans. Syst. Man Cybern. 1973, SMC-3, 610–621. [Google Scholar] [CrossRef]
- Choi, H.; Bindschadler, R. Cloud detection in Landsat imagery of ice sheets using shadow matching technique and automatic normalized difference snow index threshold value decision. Remote Sens. Environ. 2004, 91, 237–242. [Google Scholar] [CrossRef]
- McIntire, T.J.; Simpson, J.J. Arctic sea ice, cloud, water, and lead classification using neural networks and 1.6-μm data. IEEE Trans. Geosci. Remote Sens. 2002, 40, 1956–1972. [Google Scholar] [CrossRef]
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Proceedings, Part III; Springer International Publishing: Berlin/Heidelberg, Germany, 2015; pp. 234–241. [Google Scholar]
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar]
- Hong, D.; Zhang, B.; Li, H.; Li, Y.; Yao, J.; Li, C.; Werner, M.; Chanussot, J.; Zipf, A.; Zhu, X.X. Cross-city matters: A multimodal remote sensing benchmark dataset for cross-city semantic segmentation using high-resolution domain adaptation networks. Remote Sens. Environ. 2023, 299, 113856. [Google Scholar] [CrossRef]
- Lu, H.Y. Research on Remote Sensing Image Water Body Extraction Based on CNN-Transformer and Semi-Supervised Adversarial Methods. Master’s Thesis, Nanjing University of Information Science & Technology, Nanjing, China, 2024. [Google Scholar]
- Cheng, P.; Xia, M.; Wang, D.; Lin, H.; Zhao, Z. Transformer Self-Attention Change Detection Network with Frozen Parameters. Appl. Sci. 2025, 15, 3349. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is all you need. arXiv 2017, arXiv:1706.03762. [Google Scholar]
- Gu, A.; Dao, T. Mamba: Linear-time sequence modeling with selective state spaces. arXiv 2023, arXiv:2312.00752. [Google Scholar] [CrossRef]
- Ren, W.; Wang, Z.; Xia, M.; Lin, H. MFlNet: Multi-scale feature interaction network for change detection of high-resolution remote sensing images. Remote Sens. 2024, 16, 1269. [Google Scholar] [CrossRef]
- Zhu, L.; Liao, B.; Zhang, Q.; Wang, X.; Liu, W.; Wang, X. Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model. In Proceedings of the Forty-first International Conference on Machine Learning, Vienna, Austria, 21–27 July 2024. [Google Scholar]
- He, X.; Cao, K.; Zhang, J.; Yan, K.; Wang, Y.; Li, R.; Xie, C.; Hong, D.; Zhou, M. Pan-mamba: Effective pan-sharpening with state space model. Inf. Fusion 2025, 115, 102779. [Google Scholar] [CrossRef]
- Chen, K.; Chen, B.; Liu, C.; Li, W.; Zou, Z.; Shi, Z. Rsmamba: Remote sensing image classification with state space model. IEEE Geosci. Remote Sens. Lett. 2024, 21, 1–5. [Google Scholar] [CrossRef]
- Chen, L.C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking atrous convolution for semantic image segmentation. arXiv 2017, arXiv:1706.05587. [Google Scholar] [CrossRef]
- Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818. [Google Scholar]
- Gu, P.Z.; Liu, W.C.; Feng, S.Y.; Wei, T.Y.; Wang, J.; Chen, H. Hpn-cr: Heterogeneous parallel network for sar-optical data fusion cloud removal. IEEE Trans. Geosci. Remote Sens. 2025, 63, 1–15. [Google Scholar] [CrossRef]
- He, Z.C.; Wang, P.; Zou, Y.K.; Huang, B.; Zhu, D.Y.; Harry, F.L.; Henry, L. DADIGAN: A dual attention blocks-based disentangled iterative Generative Adversarial Network for cloud and shadow removal on SAR and optical images. Inf. Fusion 2025, 125, 103487. [Google Scholar] [CrossRef]
- Liu, Y.; Tian, Y.; Zhao, Y.; Yu, H.; Xie, L.; Wang, Y.; Ye, Q.; Jiao, J.; Liu, Y. Vmamba: Visual state space model. Adv. Neural Inf. Process. Syst. 2024, 37, 103031–103063. [Google Scholar]
- Gu, A.; Dao, T.; Ermon, S.; Rudra, A.; Re, C. Hippo: Recurrent memory with optimal polynomial projections. Adv. Neural Inf. Process. Syst. 2020, 33, 1474–1487. [Google Scholar]
- Aybar, C.; Ysuhuaylas, L.; Loja, J.; Gonzales, K.; Herrera, F.; Bautista, L.; Yali, R.; Flores, A.; Diaz, L.; Cuenca, N.; et al. Cloudsen12, a global dataset for semantic understanding of cloud and cloud shadow in sentinel-2. Sci. Data 2022, 9, 782. [Google Scholar] [CrossRef]
- Mohajerani, S.; Saeedi, P. Cloud-Net: An end-to-end cloud detection algorithm for Landsat 8 imagery. In Proceedings of the IGARSS 2019—2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 1029–1032. [Google Scholar]
- Hughes, M.J.; Hayes, D.J. Automated detection of cloud and cloud shadow in single-date Landsat imagery using neural networks and spatial post-processing. Remote Sens. 2014, 6, 4907–4926. [Google Scholar] [CrossRef]
| Methods | MIoU (%) |
|---|---|
| Convolutional Branch | 73.22 |
| Convolutional Branch + VSSM Branch | 76.30 (3.08↑) |
| Convolutional Branch + VSSM Branch + MC Module | 78.19 (1.89↑) |
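The MIoU and MPA figures reported throughout this section follow the standard definitions from a pixel-level confusion matrix, which can be sketched as:

```python
import numpy as np

def confusion(pred, gt, num_classes):
    """Confusion matrix M[i, j] = count of pixels with ground truth i predicted as j."""
    idx = gt.ravel() * num_classes + pred.ravel()
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def miou_mpa(cm):
    """MIoU and MPA: per-class IoU = TP / (TP + FP + FN) and per-class pixel
    accuracy = TP / (TP + FN), each averaged over the classes."""
    tp = np.diag(cm).astype(float)
    iou = tp / (cm.sum(axis=1) + cm.sum(axis=0) - tp)
    pa_per_class = tp / cm.sum(axis=1)
    return iou.mean(), pa_per_class.mean()

gt = np.array([[0, 0, 1, 1], [2, 2, 1, 0]])
pred = np.array([[0, 1, 1, 1], [2, 2, 0, 0]])
miou, mpa = miou_mpa(confusion(pred, gt, 3))
print(round(miou, 3), round(mpa, 3))  # 0.667 0.778
```

Because MPA averages per-class accuracies with equal weight, a rare class (e.g. small cloud shadows) drags MPA down much more than overall PA, which is why the ablation text reads MPA as a sensitivity indicator for small cloud regions.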
| Architecture | Model | MIoU (%) | PA (%) | MPA (%) |
|---|---|---|---|---|
| CNN | FCN-32s | 71.23 | 86.98 | 84.80 |
| | DANet | 71.79 | 87.02 | 84.14 |
| | BiSeNetV2 | 74.19 | 88.12 | 85.47 |
| | PAN | 74.83 | 88.48 | 86.48 |
| | CGNet | 74.98 | 88.59 | 86.23 |
| | LinkNet | 75.19 | 88.63 | 86.64 |
| | DenseASPP | 75.32 | 88.77 | 86.61 |
| | DeeplabV3 | 75.33 | 88.71 | 86.79 |
| | HRNet | 76.45 | 89.25 | 87.03 |
| | OCRNet | 76.74 | 89.50 | 87.66 |
| | SegNet | 77.01 | 89.62 | 87.56 |
| Transformer | SETR | 73.90 | 87.78 | 85.38 |
| | PVT | 76.62 | 89.03 | 86.59 |
| | SwinUNet | 77.53 | 89.78 | 87.61 |
| CNN-Transformer Hybrid | CVT | 73.93 | 87.93 | 85.16 |
| | MPViT | 77.22 | 88.89 | 87.37 |
| | DBNet | 77.37 | 89.71 | 87.40 |
| Mamba | CCViM | 74.50 | 88.10 | 85.30 |
| | VM-UNet | 77.13 | 89.53 | 86.29 |
| | RS3Mamba | 77.91 | 90.05 | 86.34 |
| | MCloud | 78.19 | 90.13 | 88.85 |
| Architecture | Model | Cloud P (%) | Cloud R (%) | Cloud F1 (%) | Cloud Shadow P (%) | Cloud Shadow R (%) | Cloud Shadow F1 (%) |
|---|---|---|---|---|---|---|---|
| CNN | FCN-32s | 87.55 | 89.85 | 88.68 | 78.70 | 61.31 | 68.92 |
| | DANet | 88.59 | 88.44 | 88.51 | 75.51 | 66.02 | 70.44 |
| | BiSeNetV2 | 90.08 | 89.81 | 89.94 | 77.26 | 71.03 | 74.01 |
| | PAN | 91.43 | 88.67 | 90.02 | 79.97 | 70.48 | 74.92 |
| | CGNet | 90.27 | 90.39 | 90.32 | 79.00 | 71.04 | 74.80 |
| | LinkNet | 91.96 | 88.38 | 90.13 | 80.00 | 71.50 | 75.51 |
| | DenseASPP | 91.44 | 89.23 | 90.32 | 79.71 | 71.36 | 75.30 |
| | DeeplabV3 | 89.79 | 90.66 | 90.22 | 81.03 | 70.98 | 75.67 |
| | HRNet | 91.35 | 90.04 | 90.69 | 80.00 | 74.27 | 77.02 |
| | OCRNet | 91.44 | 90.32 | 90.87 | 81.91 | 72.80 | 77.08 |
| | SegNet | 91.22 | 90.63 | 90.92 | 81.50 | 73.14 | 77.09 |
| Transformer | SETR | 88.92 | 89.96 | 89.43 | 78.07 | 71.29 | 74.52 |
| | PVT | 89.67 | 90.43 | 90.04 | 78.98 | 72.34 | 75.51 |
| | SwinUNet | 91.16 | 91.19 | 91.17 | 80.88 | 75.96 | 78.34 |
| CNN-Transformer Hybrid | CVT | 89.62 | 89.44 | 89.52 | 76.55 | 71.44 | 73.90 |
| | MPViT | 91.66 | 89.67 | 90.65 | 78.28 | 73.79 | 75.96 |
| | DBNet | 91.70 | 90.51 | 91.10 | 80.03 | 76.17 | 78.05 |
| Mamba | CCViM | 89.50 | 88.20 | 88.80 | 75.40 | 74.10 | 74.70 |
| | VM-UNet | 90.08 | 91.72 | 90.90 | 76.10 | 80.13 | 78.12 |
| | RS3Mamba | 90.93 | 91.76 | 91.34 | 74.63 | 83.02 | 78.83 |
| | MCloud | 92.15 | 92.50 | 92.32 | 83.00 | 82.50 | 82.75 |
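The per-class precision, recall, and F1 values in these tables come from the same confusion matrix, restricted to one class at a time. A short sketch with an illustrative two-class matrix (the counts below are made up, not taken from the paper):

```python
import numpy as np

def prf1(cm, c):
    """Per-class precision, recall, F1 from a confusion matrix whose rows are
    ground truth and whose columns are predictions, for class index c."""
    tp = float(cm[c, c])
    p = tp / cm[:, c].sum()          # precision: TP / (TP + FP)
    r = tp / cm[c, :].sum()          # recall:    TP / (TP + FN)
    return p, r, 2 * p * r / (p + r) # F1: harmonic mean of P and R

cm = np.array([[90, 10],   # illustrative counts: ground-truth class 0 (e.g. cloud)
               [ 5, 95]])  # ground-truth class 1
p, r, f1 = prf1(cm, 0)
print(round(p, 4), round(r, 4), round(f1, 4))  # 0.9474 0.9 0.9231
```

This also shows why the cloud-shadow F1 scores run well below the cloud F1 scores in the tables: shadows have far fewer pixels, so the same absolute number of confusions costs proportionally more recall.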
| Architecture | Model | MIoU (%) | PA (%) | MPA (%) |
|---|---|---|---|---|
| CNN | DANet | 87.69 | 93.44 | 93.45 |
| | FCN-32s | 88.67 | 94.00 | 93.99 |
| | BiSeNetV2 | 91.28 | 95.44 | 95.45 |
| | LinkNet | 91.48 | 95.55 | 95.55 |
| | DenseASPP | 91.62 | 95.62 | 95.63 |
| | PAN | 91.69 | 95.66 | 95.66 |
| | DeeplabV3 | 91.86 | 95.75 | 95.77 |
| | CGNet | 92.24 | 95.96 | 95.98 |
| | PSPNet | 92.34 | 96.02 | 96.01 |
| | SegNet | 92.58 | 96.14 | 96.16 |
| | HRNet | 92.63 | 96.17 | 96.17 |
| | CDUNet | 92.64 | 96.18 | 96.19 |
| | OCRNet | 92.69 | 96.20 | 96.21 |
| Transformer | SETR | 82.65 | 90.50 | 90.51 |
| | SwinUNet | 93.10 | 96.42 | 96.42 |
| CNN-Transformer Hybrid | CVT | 87.92 | 93.57 | 93.56 |
| | MPViT | 92.86 | 95.96 | 95.97 |
| | DBNet | 93.27 | 96.52 | 96.51 |
| Mamba | RS3Mamba | 93.00 | 96.38 | 96.38 |
| | VM-UNet | 93.50 | 96.85 | 96.88 |
| | CCViM | 94.10 | 97.15 | 97.20 |
| | MCloud | 94.60 | 97.58 | 97.62 |
| Architecture | Model | Cloud P (%) | Cloud R (%) | Cloud F1 (%) | Background P (%) | Background R (%) | Background F1 (%) |
|---|---|---|---|---|---|---|---|
| CNN | DANet | 93.69 | 93.08 | 93.38 | 93.20 | 93.79 | 93.50 |
| | FCN-32s | 93.78 | 94.17 | 93.98 | 94.21 | 93.82 | 94.02 |
| | BiSeNetV2 | 94.68 | 96.24 | 95.46 | 96.22 | 94.65 | 95.43 |
| | LinkNet | 95.63 | 95.41 | 95.52 | 95.47 | 95.68 | 95.58 |
| | DenseASPP | 95.35 | 95.87 | 95.61 | 95.90 | 95.38 | 95.64 |
| | PAN | 95.33 | 95.99 | 95.66 | 96.00 | 95.35 | 95.67 |
| | DeeplabV3 | 94.75 | 96.83 | 95.79 | 96.79 | 94.69 | 95.74 |
| | CGNet | 94.88 | 96.12 | 95.49 | 97.08 | 94.81 | 95.95 |
| | PSPNet | 95.44 | 96.61 | 96.02 | 96.61 | 95.43 | 96.02 |
| | SegNet | 96.22 | 96.02 | 96.12 | 96.07 | 96.27 | 96.17 |
| | HRNet | 96.02 | 96.29 | 96.16 | 96.32 | 96.06 | 96.19 |
| | CDUNet | 95.52 | 96.86 | 96.19 | 96.85 | 95.50 | 96.18 |
| | OCRNet | 96.48 | 95.87 | 96.17 | 95.94 | 96.54 | 96.24 |
| Transformer | SETR | 89.86 | 91.20 | 90.53 | 91.16 | 89.82 | 90.49 |
| | SwinUNet | 96.40 | 96.41 | 96.41 | 96.45 | 96.44 | 96.44 |
| CNN-Transformer Hybrid | CVT | 93.51 | 93.56 | 93.54 | 93.62 | 93.58 | 93.60 |
| | MPViT | 95.97 | 95.93 | 95.94 | 95.23 | 95.74 | 95.48 |
| | DBNet | 96.82 | 96.16 | 96.49 | 96.22 | 96.67 | 96.44 |
| Mamba | RS3Mamba | 95.99 | 96.75 | 96.37 | 96.76 | 96.01 | 96.38 |
| | VM-UNet | 96.82 | 96.50 | 96.66 | 96.92 | 96.80 | 96.86 |
| | CCViM | 97.10 | 96.95 | 97.03 | 97.30 | 97.10 | 97.20 |
| | MCloud | 97.50 | 97.20 | 97.35 | 97.80 | 97.30 | 97.55 |
Overall data:

| Model | MIoU (%) | PA (%) | MPA (%) | R (%) | F1 (%) |
|---|---|---|---|---|---|
| DANet | 55.61 | 85.04 | 70.28 | 67.12 | 66.76 |
| FCN-32s | 61.38 | 88.03 | 75.41 | 71.20 | 72.40 |
| BiSeNetV2 | 64.38 | 88.57 | 80.33 | 73.26 | 75.80 |
| SegNet | 65.86 | 89.30 | 80.74 | 75.18 | 77.53 |
| CGNet | 66.82 | 89.93 | 80.00 | 76.37 | 77.31 |
| PSPNet | 67.23 | 89.92 | 82.50 | 75.23 | 77.81 |
| DenseASPP | 67.73 | 89.81 | 82.42 | 76.21 | 78.63 |
| DeeplabV3 | 68.26 | 90.06 | 82.94 | 76.90 | 79.05 |
| LinkNet | 68.62 | 90.84 | 83.38 | 76.80 | 79.24 |
| HRNet | 69.74 | 90.98 | 84.61 | 77.30 | 80.51 |
| OCRNet | 69.91 | 90.94 | 86.21 | 77.15 | 80.04 |
| SETR | 63.59 | 87.73 | 79.58 | 72.89 | 75.38 |
| PVT | 68.54 | 89.28 | 83.02 | 77.57 | 80.20 |
| SwinUNet | 73.00 | 91.86 | 86.44 | 80.44 | 83.10 |
| CVT | 62.68 | 87.03 | 78.30 | 72.42 | 74.60 |
| MPViT | 72.98 | 90.02 | 85.26 | 79.49 | 82.27 |
| DBNet | 74.04 | 92.54 | 87.26 | 81.01 | 83.65 |
| VM-UNet | 74.90 | 92.14 | 88.24 | 81.95 | 84.59 |
| RS3Mamba | 75.36 | 93.14 | 86.86 | 83.52 | 84.98 |
| CCViM | 76.46 | 93.02 | 89.10 | 83.17 | 85.61 |
| MCloud | 77.47 | 93.77 | 88.23 | 85.06 | 86.50 |
Class pixel accuracy (%) by category (CS: cloud shadow; CSOW: cloud shadow over water; W: water; I/S: ice/snow; L: land; C: cloud; F: flooded):

| Model | CS | CSOW | W | I/S | L | C | F |
|---|---|---|---|---|---|---|---|
| DANet | 55.44 | 30.37 | 89.21 | 87.63 | 91.07 | 76.89 | 61.36 |
| FCN-32s | 64.63 | 37.51 | 89.57 | 89.56 | 92.36 | 83.10 | 71.17 |
| BiSeNetV2 | 73.24 | 57.35 | 90.20 | 92.17 | 91.65 | 83.18 | 74.52 |
| SegNet | 76.98 | 55.74 | 86.82 | 92.49 | 92.66 | 83.50 | 77.02 |
| CGNet | 72.47 | 50.32 | 93.00 | 90.53 | 94.02 | 84.17 | 75.49 |
| PSPNet | 76.72 | 56.59 | 93.06 | 92.59 | 92.75 | 83.81 | 82.02 |
| DenseASPP | 77.13 | 57.26 | 94.13 | 93.23 | 93.46 | 81.20 | 80.58 |
| DeeplabV3 | 79.87 | 60.99 | 91.28 | 91.56 | 93.71 | 81.97 | 81.21 |
| LinkNet | 79.30 | 60.06 | 87.36 | 91.46 | 93.20 | 88.70 | 83.58 |
| HRNet | 82.77 | 65.31 | 89.36 | 93.22 | 93.23 | 85.91 | 82.52 |
| OCRNet | 81.64 | 68.40 | 94.27 | 93.53 | 92.46 | 87.40 | 85.78 |
| SETR | 71.16 | 55.83 | 89.85 | 92.31 | 91.89 | 78.90 | 77.14 |
| PVT | 76.34 | 62.57 | 90.42 | 93.23 | 92.46 | 84.60 | 81.55 |
| SwinUNet | 80.14 | 70.09 | 92.34 | 94.17 | 94.15 | 88.27 | 85.94 |
| CVT | 67.11 | 52.85 | 88.78 | 93.11 | 91.64 | 77.79 | 76.85 |
| MPViT | 80.42 | 68.95 | 91.03 | 93.77 | 92.96 | 84.56 | 85.18 |
| DBNet | 81.74 | 70.21 | 93.30 | 94.15 | 94.44 | 89.38 | 87.62 |
| VM-UNet | 82.43 | 75.21 | 94.15 | 94.88 | 95.04 | 85.22 | 90.76 |
| RS3Mamba | 85.66 | 70.69 | 93.50 | 94.41 | 95.25 | 89.84 | 78.71 |
| CCViM | 83.39 | 77.55 | 94.27 | 93.95 | 95.23 | 89.01 | 90.30 |
| MCloud | 81.95 | 74.38 | 95.34 | 94.31 | 95.86 | 92.06 | 83.74 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, Z.; Hu, Z.; Xia, M.; Yan, Y.; Zhang, R.; Liu, S.; Li, T. Semantic Segmentation of Clouds and Cloud Shadows Using State Space Models. Remote Sens. 2025, 17, 3120. https://doi.org/10.3390/rs17173120