Physics-Guided Multi-Representation Learning with Quadruple Consistency Constraints for Robust Cloud Detection in Multi-Platform Remote Sensing
Abstract
1. Introduction
- (1) InfoNCE Contrastive Loss Mechanism: To address the high intra-class variability and inter-class similarity of multi-platform cloud detection, we propose an InfoNCE-based contrastive loss. By drawing similar pixel features closer while pushing dissimilar feature centers apart, it builds compact intra-class structures and well-separated inter-class configurations, enhancing discriminability across cross-platform data. The contrastive loss also remains consistent with physical priors (cloud regions exhibit spatial continuity), significantly improving the robustness of multimodal feature representations.
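The contrastive mechanism above can be illustrated with a minimal NumPy sketch for a single pixel embedding. The function name `info_nce_loss`, the use of cosine similarity, and the temperature `tau=0.1` are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style loss for one anchor pixel embedding (illustrative sketch).

    anchor:    (d,) feature of a pixel
    positive:  (d,) feature of a same-class pixel or class center
    negatives: (n, d) features of other-class centers
    Returns -log( exp(s+/tau) / (exp(s+/tau) + sum_j exp(s-_j/tau)) ).
    """
    def cos(a, b):
        # cosine similarity between vector a and each row of b
        return (b @ a) / (np.linalg.norm(b, axis=-1) * np.linalg.norm(a) + 1e-8)

    s_pos = cos(anchor, positive[None, :])      # (1,) similarity to positive
    s_neg = cos(anchor, negatives)              # (n,) similarities to negatives
    logits = np.concatenate([s_pos, s_neg]) / tau
    logits -= logits.max()                      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(-np.log(probs[0] + 1e-12))     # -log prob of the positive
```

Minimizing this term pulls the anchor toward its positive and away from the negative centers, which is the compact-intra-class / separated-inter-class behavior described above.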
- (2) Boundary-Aware Regional Adaptive Weighted Cross-Entropy Loss (BARA-C-E Loss): To resolve cloud-boundary ambiguity and detail loss in multi-platform data fusion, we design a BARA-C-E loss module that integrates PA-CAM confidence with Euclidean distance transforms. Using morphological operations for precise boundary extraction and adaptive weight assignment, it balances class centers and edge information across platform data, recovers cross-modal edge features more faithfully, and addresses the weakness of conventional pixel-wise losses in multi-platform boundary learning.
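A minimal sketch of the boundary-weighting idea follows. It substitutes a brute-force Euclidean distance to 4-neighbour boundary pixels for the paper's distance transform and morphological operations, and omits the PA-CAM confidence term; function names and the weight form `1 + alpha * exp(-d / sigma)` are illustrative assumptions:

```python
import numpy as np

def boundary_weight_map(mask, alpha=2.0, sigma=2.0):
    """Weight map that emphasises pixels near label boundaries (sketch).

    mask: (H, W) binary label map. A pixel is 'boundary' if any 4-neighbour
    differs. Weight grows toward the boundary: w = 1 + alpha * exp(-d/sigma),
    with d the Euclidean distance to the nearest boundary pixel.
    """
    h, w = mask.shape
    pad = np.pad(mask, 1, mode="edge")
    boundary = (
        (pad[1:-1, 1:-1] != pad[:-2, 1:-1]) | (pad[1:-1, 1:-1] != pad[2:, 1:-1]) |
        (pad[1:-1, 1:-1] != pad[1:-1, :-2]) | (pad[1:-1, 1:-1] != pad[1:-1, 2:])
    )
    ys, xs = np.nonzero(boundary)
    if len(ys) == 0:                         # no boundary: uniform weights
        return np.ones_like(mask, dtype=float)
    yy, xx = np.mgrid[0:h, 0:w]
    d = np.sqrt((yy[..., None] - ys) ** 2 + (xx[..., None] - xs) ** 2).min(-1)
    return 1.0 + alpha * np.exp(-d / sigma)

def weighted_ce(probs, mask, weights):
    """Pixel-wise binary cross-entropy scaled by the boundary weights."""
    p = np.where(mask == 1, probs, 1.0 - probs)   # prob of the true class
    return float((weights * -np.log(p + 1e-12)).mean())
```

Pixels on or near the cloud boundary receive roughly `1 + alpha` times the weight of interior pixels, so boundary errors dominate the loss.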
- (3) Uncertainty-Aware Quadruple Consistency Propagation (UAQCP): To target cross-modal feature inconsistency, we construct a UAQCP module encompassing structural, textural, RGB, and physical consistency. The structural constraint compels the student model to learn large-scale cloud structures consistent with atmospheric fluid dynamics; the textural constraint sharpens the capture of high-frequency oscillatory features at cloud edges driven by turbulent diffusion; the RGB and physical constraints, respectively, enforce cross-network consistency of multi-channel features and of Pseudo-NDVI spectral responses. Together they resolve physics-violating failures such as cloud–snow confusion in multi-platform environments while comprehensively exploiting complementary information from multi-source remote sensing data.
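The four constraints can be sketched as one weighted sum of per-branch teacher–student consistency terms. The MSE form, the branch names, and the uniform default weights are illustrative assumptions; the paper's actual consistency measures and uncertainty weighting may differ:

```python
import numpy as np

def consistency_mse(student, teacher):
    """Mean-squared consistency between student and teacher outputs."""
    return float(((student - teacher) ** 2).mean())

def uaqcp_loss(student_outs, teacher_outs, weights=None):
    """Sketch of a quadruple consistency objective: one term per
    representation (structure, texture, RGB, physical/Pseudo-NDVI),
    summed with per-branch weights. Teacher outputs are treated as
    fixed targets (no gradient flows through them)."""
    branches = ("structure", "texture", "rgb", "physical")
    weights = weights or {b: 1.0 for b in branches}
    return sum(weights[b] * consistency_mse(student_outs[b], teacher_outs[b])
               for b in branches)
```

When the student matches the teacher on every representation the loss vanishes; any branch-level disagreement contributes its own penalty, so no single modality can dominate training.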
- (4) PA-CAM and Information Entropy Dynamic Confidence Screening Mechanism: To address pseudo-label noise propagation in multi-platform semi-supervised learning, we propose the Physics-guided Activation Map Dynamic Confidence (PAMC) mechanism. It combines PA-CAM physical confidence with the information entropy of the teacher network's predictive distribution and applies percentile-based dynamic threshold screening. This physics–probability dual filtering significantly improves cross-platform pseudo-label accuracy, avoids the limitations of fixed thresholds under varying imaging conditions, and ensures that only high-quality pseudo-labels propagate in multi-platform environments.
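The dual-filtering step can be sketched as follows. The combined score `pa_cam_conf * (1 - entropy)` and the percentile keep-rate are illustrative assumptions about how the two confidence sources might be fused, not the paper's exact rule:

```python
import numpy as np

def select_pseudo_labels(teacher_probs, pa_cam_conf, keep_percentile=70):
    """Physics–probability dual filtering of pseudo-labels (sketch).

    teacher_probs: (N, C) teacher softmax outputs for N pixels.
    pa_cam_conf:   (N,) PA-CAM physical confidence in [0, 1].
    A pixel is kept when its combined score (low predictive entropy,
    high physical confidence) clears a percentile-based threshold,
    so the cut adapts to each batch instead of using a fixed value.
    """
    eps = 1e-12
    entropy = -(teacher_probs * np.log(teacher_probs + eps)).sum(axis=1)
    entropy /= np.log(teacher_probs.shape[1])          # normalise to [0, 1]
    score = pa_cam_conf * (1.0 - entropy)              # high = trustworthy
    thr = np.percentile(score, 100 - keep_percentile)  # dynamic threshold
    keep = score >= thr
    pseudo = teacher_probs.argmax(axis=1)              # hard pseudo-labels
    return pseudo, keep
```

Because the threshold is a batch percentile, the same fraction of pixels survives under bright daytime and dark nighttime imaging alike, which is the motivation for dynamic rather than fixed thresholds.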
2. Related Work
2.1. Cloud Segmentation
2.2. Mean-Teacher and Consistent Learning
2.3. Physics-Guided Remote Sensing
2.4. Multimodal Fusion
3. Methods
3.1. Overall Framework
3.2. Physics-Guided Feature Generation (PGFG)
3.2.1. Pseudo-NDVI
3.2.2. TV-L1 Multi-Representation Decomposition
3.2.3. Physics-Augmented Class Activation Map (PA-CAM)
Algorithm 1 Physics-Augmented CAM (PA-CAM)
Require: structure image, model
Ensure: PA-CAM class-evidence map
1: Initialization: obtain the activations of two selected layers and the class logit
2: for each activation map do
3:    normalize the activation map
4:    Hadamard product with the structure image
5:    append the modulated map
6: end for
7: aggregate the appended maps as the logit of the class
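The normalize/Hadamard/aggregate steps of Algorithm 1 can be sketched in NumPy. The per-channel min–max normalisation and mean aggregation are assumptions where the extracted algorithm text elides the exact operators:

```python
import numpy as np

def pa_cam(activations, structure_img):
    """Sketch of the PA-CAM aggregation in Algorithm 1.

    activations:   list of (K, H, W) activation maps from two layers.
    structure_img: (H, W) structure component scaled to [0, 1].
    Each channel map is min-max normalised, modulated by the physical
    structure image via a Hadamard product, then pooled into a single
    class-evidence map per layer; the two layers are averaged.
    """
    fused = []
    for act in activations:                  # one entry per chosen layer
        a = act.astype(float)
        mn = a.min(axis=(1, 2), keepdims=True)
        mx = a.max(axis=(1, 2), keepdims=True)
        a = (a - mn) / (mx - mn + 1e-8)      # per-channel normalisation
        fused.append((a * structure_img).mean(axis=0))  # Hadamard + pool
    return np.mean(fused, axis=0)            # combine the two layers
```

The Hadamard product suppresses activations that fall outside physically structured regions, which is how the physical prior enters the class activation map.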
3.3. Hybrid Supervision with Prior Knowledge
3.3.1. Contrastive Loss
3.3.2. Boundary-Aware Regional Adaptive Weighted Cross-Entropy Loss (BARA-C-E Loss)
3.4. Uncertainty Aware Propagation
3.4.1. Dynamic Confidence Level of PA-CAM
3.4.2. Uncertainty-Aware Quadruple Consistency Propagation (UAQCP)
4. Experiments
4.1. Datasets
4.2. Implementation Details
4.3. Comparison Experiments
4.4. Ablation Experiments
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Zhang, Q.; Yuan, Q.; Zeng, C.; Li, X.; Wei, Y. Missing Data Reconstruction in Remote Sensing Image With a Unified Spatial–Temporal–Spectral Deep Convolutional Neural Network. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4274–4288. [Google Scholar] [CrossRef]
- Ghamisi, P.; Rasti, B.; Yokoya, N.; Wang, Q.M.; Hofle, B.; Bruzzone, L.; Bovolo, F.; Chi, M.M.; Anders, K.; Gloaguen, R.; et al. Multisource and Multitemporal Data Fusion in Remote Sensing: A Comprehensive Review of the State of the Art. IEEE Geosci. Remote Sens. Mag. 2019, 7, 6–39. [Google Scholar] [CrossRef]
- Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.-S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36. [Google Scholar] [CrossRef]
- Xu, C.; Zhan, Y.; Wang, Z.; Yang, J. Multimodal fusion based few-shot network intrusion detection system. Sci. Rep. 2025, 15, 21986. [Google Scholar] [CrossRef]
- Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. Adv. Neural Inf. Process. Syst. 2021, 34, 12077–12090. [Google Scholar]
- Papadomanolaki, M.; Vakalopoulou, M.; Karantzalos, K. A Deep Multitask Learning Framework Coupling Semantic Segmentation and Fully Convolutional LSTM Networks for Urban Change Detection. IEEE Trans. Geosci. Remote Sens. 2021, 59, 7651–7668. [Google Scholar] [CrossRef]
- Dong, M.; Yang, A.; Wang, Z.; Li, D.; Yang, J.; Zhao, R. Uncertainty-aware consistency learning for semi-supervised medical image segmentation. Knowl.-Based Syst. 2025, 309, 112890. [Google Scholar] [CrossRef]
- Xue, Z.; Yang, G.; Yu, X.; Yu, A.; Guo, Y.; Liu, B.; Zhou, J. Multimodal self-supervised learning for remote sensing data land cover classification. Pattern Recognit. 2025, 157, 110959. [Google Scholar] [CrossRef]
- Han, J.; Yang, W.; Wang, Y.; Chen, L.; Luo, Z. Remote Sensing Teacher: Cross-Domain Detection Transformer With Learnable Frequency-Enhanced Feature Alignment in Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–14. [Google Scholar] [CrossRef]
- Wang, S.; Sun, X.; Chen, C.; Hong, D.; Han, J. Semi-Supervised Semantic Segmentation for Remote Sensing Images via Multiscale Uncertainty Consistency and Cross-Teacher–Student Attention. IEEE Trans. Geosci. Remote Sens. 2025, 63, 1–15. [Google Scholar] [CrossRef]
- Zucker, S.; Batenkov, D.; Segal Rozenhaimer, M. Physics-informed neural networks for modeling atmospheric radiative transfer. J. Quant. Spectrosc. Radiat. Transf. 2025, 331, 109253. [Google Scholar] [CrossRef]
- Yao, X.; Guo, Q.; Li, A. Cloud Detection in Optical Remote Sensing Images With Deep Semi-Supervised and Active Learning. IEEE Geosci. Remote Sens. Lett. 2023, 20, 1–5. [Google Scholar] [CrossRef]
- Xu, F.; Shi, Y.; Ebel, P.; Yang, W.; Zhu, X.X. Multimodal and Multiresolution Data Fusion for High-Resolution Cloud Removal: A Novel Baseline and Benchmark. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–15. [Google Scholar] [CrossRef]
- Xia, K.; Wang, L.; Zhou, S.; Hua, G.; Tang, W. Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Localization. Proc. IEEE Int. Conf. Comput. Vis. 2023, 10126–10135. [Google Scholar] [CrossRef]
- Wang, X.; Fan, Z.; Jiang, Z.; Yan, Y.; Yang, H. EDFF-Unet: An Improved Unet-Based Method for Cloud and Cloud Shadow Segmentation in Remote Sensing Images. Remote Sens. 2025, 17, 1432. [Google Scholar] [CrossRef]
- Li, X.; Yang, X.; Li, X.; Lu, S.; Ye, Y.; Ban, Y. GCDB-UNet: A novel robust cloud detection approach for remote sensing images. Knowl.-Based Syst. 2022, 238, 107890. [Google Scholar] [CrossRef]
- Gan, X.; Li, W.; Zhang, Y.; Long, W.; Lu, Y.; Chen, Z. Prior Information-Guided Semi-Supervised Semantic Segmentation of Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2025, 63, 1–16. [Google Scholar] [CrossRef]
- Buttar, P.K.; Sachan, M.K. Semantic segmentation of clouds in satellite images based on U-Net++ architecture and attention mechanism. Expert Syst. Appl. 2022, 209, 118380. [Google Scholar] [CrossRef]
- Roy, S.K.; Deria, A.; Hong, D.; Rasti, B.; Plaza, A.; Chanussot, J. Multimodal Fusion Transformer for Remote Sensing Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–20. [Google Scholar] [CrossRef]
- Chen, S.; Fang, Z.; Wan, S.; Zhou, T.; Chen, C.; Wang, M.; Li, Q. Geometrically aware transformer for point cloud analysis. Sci. Rep. 2025, 15, 16545. [Google Scholar] [CrossRef]
- Jonnala, N.S.; Bheemana, R.C.; Prakash, K.; Bansal, S.; Jain, A.; Pandey, V.; Faruque, M.R.I.; Al-Mugren, K.S. DSIA U-Net: Deep-shallow interaction with attention mechanism UNet for remote sensing satellite images. Sci. Rep. 2025, 15, 549. [Google Scholar] [CrossRef]
- Tarvainen, A.; Valpola, H. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. Adv. Neural Inf. Process. Syst. 2017, 30, 1195–1204. [Google Scholar]
- Wang, J.; Ding, H.Q.; Chen, C.; He, C.; Luo, B. Semi-Supervised Remote Sensing Image Semantic Segmentation via Consistency Regularization and Average Update of Pseudo-Label. Remote Sens. 2020, 12, 3603. [Google Scholar] [CrossRef]
- Kumar, A.; Mitra, S.; Rawat, Y.S. Stable Mean Teacher for Semi-supervised Video Action Detection. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; Volume 39, pp. 4419–4427. [Google Scholar]
- Chen, Y.; Yang, Z.; Zhang, L.; Cai, W. A semi-supervised boundary segmentation network for remote sensing images. Sci. Rep. 2025, 15, 2007. [Google Scholar] [CrossRef]
- Zhang, K.; Li, P.; Wang, J. A Review of Deep Learning-Based Remote Sensing Image Caption: Methods, Models, Comparisons and Future Directions. Remote Sens. 2024, 16, 4113. [Google Scholar] [CrossRef]
- Liu, S.; Zhang, T.; Deng, R.; Liu, X.; Liu, H. Physics-guided deep learning framework with attention for image denoising. Vis. Comput. 2025, 41, 6671–6685. [Google Scholar] [CrossRef]
- Zérah, Y.; Valero, S.; Inglada, J. Physics-constrained deep learning for biophysical parameter retrieval from Sentinel-2 images: Inversion of the PROSAIL model. Remote Sens. Environ. 2024, 312, 114309. [Google Scholar] [CrossRef]
- Wang, Y.; Gong, J.; Wu, D.L.; Ding, L. Toward Physics-Informed Neural Networks for 3-D Multilayer Cloud Mask Reconstruction. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–14. [Google Scholar] [CrossRef]
- Li, Z.; Zhao, W.; Du, X.; Zhou, G.; Zhang, S. Cross-Modal Retrieval and Semantic Refinement for Remote Sensing Image Captioning. Remote Sens. 2024, 16, 196. [Google Scholar] [CrossRef]
- Guan, J.; Shu, Y.; Li, W.; Song, Z.; Zhang, Y. PR-CLIP: Cross-Modal Positional Reconstruction for Remote Sensing Image–Text Retrieval. Remote Sens. 2025, 17, 2117. [Google Scholar] [CrossRef]
- Yang, X.; Li, C.; Wang, Z.; Xie, H.; Mao, J.; Yin, G. Remote Sensing Cross-Modal Text-Image Retrieval Based on Attention Correction and Filtering. Remote Sens. 2025, 17, 503. [Google Scholar] [CrossRef]
- Gitelson, A.A.; Kaufman, Y.J.; Stark, R.; Rundquist, D. Novel algorithms for remote estimation of vegetation fraction. Remote Sens. Environ. 2002, 80, 76–87. [Google Scholar] [CrossRef]
- Yin, W.; Goldfarb, D.; Osher, S. The Total Variation Regularized L1 Model for Multiscale Decomposition. Multiscale Model. Simul. 2007, 6, 190–211. [Google Scholar] [CrossRef]
- Chambolle, A.; Pock, T. A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging. J. Math. Imaging Vis. 2011, 40, 120–145. [Google Scholar] [CrossRef]
- Wang, H.; Wang, Z.; Du, M.; Yang, F.; Zhang, Z.; Ding, S.; Mardziel, P.; Hu, X. Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14–19 June 2020; IEEE Computer Society: Los Alamitos, CA, USA, 2020; pp. 111–119. [Google Scholar]
- Yu, L.; Wang, S.; Li, X.; Fu, C.-W.; Heng, P.-A. Uncertainty-Aware Self-Ensembling Model for Semi-Supervised 3D Left Atrium Segmentation. Lect. Notes Comput. Sci. 2019, 11765, 605–613. [Google Scholar]
- Oord, A.v.d.; Li, Y.; Vinyals, O. Representation learning with contrastive predictive coding. arXiv 2018, arXiv:1807.03748. [Google Scholar]
- Zhao, H.; Kong, X.; He, J.; Qiao, Y.; Dong, C. Efficient Image Super-Resolution Using Pixel Attention. Lect. Notes Comput. Sci. 2020, 12537, 56–72. [Google Scholar] [CrossRef]
- Zhang, Z.; Yang, S.; Liu, S.; Xiao, B.; Cao, X. Ground-Based Cloud Detection Using Multiscale Attention Convolutional Neural Network. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
- Liu, J.; Ji, S. A Novel Recurrent Encoder-Decoder Structure for Large-Scale Multi-View Stereo Reconstruction From an Open Aerial Dataset. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 6049–6058. [Google Scholar]
- Wang, Y.; Wang, H.; Shen, Y.; Fei, J.; Li, W.; Jin, G.; Wu, L.; Zhao, R.; Le, X. Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022. [Google Scholar]
- Tang, C.; Zeng, X.; Zhou, L.; Zhou, Q.; Wang, P.; Wu, X.; Ren, H.; Zhou, J.; Wang, Y. Semi-supervised medical image segmentation via hard positives oriented contrastive learning. Pattern Recognit. 2024, 146, 110020. [Google Scholar] [CrossRef]
- Shen, Z.; Cao, P.; Yang, H.; Liu, X.; Yang, J.; Zaiane, O.R. Co-training with high-confidence pseudo labels for semi-supervised medical image segmentation. arXiv 2023, arXiv:2301.04465. [Google Scholar]
- Sohn, K.; Berthelot, D.; Carlini, N.; Zhang, Z.; Zhang, H.; Raffel, C.A.; Cubuk, E.D.; Kurakin, A.; Li, C.-L. FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence. NeurIPS 2020, 33, 596–608. [Google Scholar]
- Yang, L.; Qi, L.; Feng, L.; Zhang, W.; Shi, Y. Revisiting weak-to-strong consistency in semi-supervised semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 7236–7246. [Google Scholar]
- Chen, H.; Tao, R.; Fan, Y.; Wang, Y.; Wang, J.; Schiele, B.; Xie, X.; Raj, B.; Savvides, M. SoftMatch: Addressing the Quantity-Quality Tradeoff in Semi-supervised Learning. arXiv 2023, arXiv:2301.10921. [Google Scholar]
- Sun, B.; Yang, Y.; Yuan, W.; Zhang, L.; Cheng, M.M.; Hou, Q. CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–24 June 2024. [Google Scholar]
Method | Recall ↑ | F1-Score ↑ | Error Rate ↓ | MIoU ↑ |
---|---|---|---|---|
U2PL [42] | 0.840 ± 0.012 ○ | 0.771 ± 0.012 * | 0.127 ± 0.009 * | 0.669 ± 0.008 * |
BHPC [43] | 0.778 ± 0.014 * | 0.779 ± 0.011 ○ | 0.114 ± 0.017 ○ | 0.679 ± 0.013 ○ |
UCMT [44] | 0.789 ± 0.011 * | 0.783 ± 0.016 ○ | 0.112 ± 0.018 ○ | 0.684 ± 0.014 ○ |
FixMatch [45] | 0.829 ± 0.014 ○ | 0.755 ± 0.013 * | 0.131 ± 0.007 * | 0.648 ± 0.010 * |
UniMatch [46] | 0.774 ± 0.012 * | 0.767 ± 0.010 * | 0.126 ± 0.011 * | 0.664 ± 0.018 * |
SoftMatch [47] | 0.781 ± 0.012 * | 0.775 ± 0.007 ○ | 0.116 ± 0.017 ○ | 0.674 ± 0.013 ○ |
CorrMatch [48] | 0.818 ± 0.014 ○ | 0.777 ± 0.018 ○ | 0.119 ± 0.013 ○ | 0.675 ± 0.009 ○ |
Ours | 0.854 ± 0.017 | 0.800 ± 0.013 | 0.105 ± 0.015 | 0.708 ± 0.015 |
Method | Recall ↑ | F1-Score ↑ | Error Rate ↓ | MIoU ↑ |
---|---|---|---|---|
U2PL | 0.809 ± 0.017 * | 0.860 ± 0.016 * | 0.100 ± 0.014 * | 0.756 ± 0.010 * |
BHPC | 0.876 ± 0.015 ○ | 0.873 ± 0.014 ○ | 0.096 ± 0.019 ○ | 0.776 ± 0.014 ○ |
UCMT | 0.886 ± 0.012 ○ | 0.874 ± 0.007 ○ | 0.097 ± 0.011 ○ | 0.777 ± 0.011 ○ |
FixMatch | 0.786 ± 0.017 * | 0.853 ± 0.014 * | 0.103 ± 0.016 * | 0.745 ± 0.012 * |
UniMatch | 0.814 ± 0.016 * | 0.863 ± 0.014 * | 0.098 ± 0.018 ○ | 0.761 ± 0.017 * |
SoftMatch | 0.856 ± 0.018 * | 0.871 ± 0.009 ○ | 0.096 ± 0.012 ○ | 0.773 ± 0.013 ○ |
CorrMatch | 0.876 ± 0.018 ○ | 0.876 ± 0.014 ○ | 0.094 ± 0.010 ○ | 0.780 ± 0.016 ○ |
Ours | 0.901 ± 0.018 | 0.882 ± 0.015 | 0.091 ± 0.018 | 0.790 ± 0.016 |
Method | Recall ↑ | F1-Score ↑ | Error Rate ↓ | MIoU ↑ |
---|---|---|---|---|
Day Time (SWIMSEG) | ||||
U2PL | 0.793 ± 0.009 * | 0.847 ± 0.017 * | 0.154 ± 0.018 ○ | 0.741 ± 0.011 ○ |
BHPC | 0.840 ± 0.014 ○ | 0.845 ± 0.018 ○ | 0.160 ± 0.016 * | 0.750 ± 0.018 * |
UCMT | 0.855 ± 0.014 * | 0.858 ± 0.012 ○ | 0.150 ± 0.014 ○ | 0.760 ± 0.017 * |
FixMatch | 0.845 ± 0.017 ○ | 0.838 ± 0.017 * | 0.170 ± 0.017 * | 0.730 ± 0.014 ○ |
UniMatch | 0.850 ± 0.010 * | 0.853 ± 0.013 * | 0.155 ± 0.010 ○ | 0.755 ± 0.009 ○ |
SoftMatch | 0.816 ± 0.006 ○ | 0.870 ± 0.019 * | 0.128 ± 0.014 ○ | 0.779 ± 0.010 * |
CorrMatch | 0.876 ± 0.016 * | 0.882 ± 0.017 * | 0.128 ± 0.009 ○ | 0.791 ± 0.015 * |
Ours | 0.924 ± 0.012 | 0.911 ± 0.021 | 0.098 ± 0.015 | 0.838 ± 0.018 |
Night Time (SWINSEG) | ||||
U2PL | 0.966 ± 0.016 ○ | 0.788 ± 0.015 * | 0.246 ± 0.007 * | 0.650 ± 0.013 * |
BHPC | 0.830 ± 0.007 * | 0.835 ± 0.013 ○ | 0.170 ± 0.012 ○ | 0.740 ± 0.019 ○ |
UCMT | 0.850 ± 0.016 * | 0.853 ± 0.017 ○ | 0.155 ± 0.017 ○ | 0.755 ± 0.017 ○ |
FixMatch | 0.835 ± 0.016 ○ | 0.822 ± 0.021 * | 0.180 ± 0.014 * | 0.720 ± 0.014 ○ |
UniMatch | 0.845 ± 0.013 * | 0.848 ± 0.010 ○ | 0.160 ± 0.011 * | 0.750 ± 0.018 ○ |
SoftMatch | 0.993 ± 0.012 ○ | 0.781 ± 0.012 * | 0.263 ± 0.011 * | 0.640 ± 0.011 * |
CorrMatch | 0.991 ± 0.011 ○ | 0.788 ± 0.018 * | 0.252 ± 0.014 * | 0.650 ± 0.013 * |
Ours | 0.931 ± 0.018 | 0.864 ± 0.013 | 0.138 ± 0.017 | 0.761 ± 0.011 |
Day + Night Time (SWINSEG) | ||||
U2PL | 0.818 ± 0.021 * | 0.846 ± 0.008 * | 0.160 ± 0.020 * | 0.738 ± 0.019 ○ |
BHPC | 0.866 ± 0.014 ○ | 0.872 ± 0.014 * | 0.137 ± 0.009 * | 0.775 ± 0.013 * |
UCMT | 0.885 ± 0.010 * | 0.878 ± 0.021 ○ | 0.134 ± 0.011 * | 0.784 ± 0.003 * |
FixMatch | 0.864 ± 0.014 ○ | 0.817 ± 0.019 * | 0.156 ± 0.015 * | 0.721 ± 0.016 * |
UniMatch | 0.867 ± 0.017 * | 0.820 ± 0.019 * | 0.150 ± 0.010 ○ | 0.724 ± 0.014 * |
SoftMatch | 0.839 ± 0.018 * | 0.867 ± 0.016 * | 0.137 ± 0.012 ○ | 0.773 ± 0.012 * |
CorrMatch | 0.893 ± 0.017 * | 0.876 ± 0.013 * | 0.137 ± 0.014 * | 0.783 ± 0.015 * |
Ours | 0.929 ± 0.013 | 0.909 ± 0.011 | 0.100 ± 0.008 | 0.835 ± 0.009 |
Method | Recall ↑ | F1-Score ↑ | Error Rate ↓ | MIoU ↑ |
---|---|---|---|---|
Training on SWIMSEG, SWINSEG and test on TCCD | ||||
U2PL | 0.336 ± 0.017 * | 0.420 ± 0.009 * | 0.256 ± 0.013 * | 0.302 ± 0.009 * |
BHPC | 0.352 ± 0.018 ○ | 0.445 ± 0.012 ○ | 0.244 ± 0.005 ○ | 0.325 ± 0.015 * |
UCMT | 0.450 ± 0.014 ○ | 0.527 ± 0.011 ○ | 0.208 ± 0.020 ○ | 0.409 ± 0.011 * |
FixMatch | 0.215 ± 0.013 * | 0.269 ± 0.007 * | 0.306 ± 0.012 * | 0.186 ± 0.013 * |
UniMatch | 0.431 ± 0.014 * | 0.529 ± 0.015 * | 0.215 ± 0.007 ○ | 0.403 ± 0.012 * |
SoftMatch | 0.291 ± 0.013 * | 0.368 ± 0.010 ○ | 0.276 ± 0.012 ○ | 0.262 ± 0.016 * |
CorrMatch | 0.399 ± 0.005 ○ | 0.472 ± 0.012 ○ | 0.239 ± 0.014 ○ | 0.355 ± 0.016 ○ |
Ours | 0.553 ± 0.012 | 0.627 ± 0.014 | 0.171 ± 0.009 | 0.503 ± 0.011 |
Training on TCCD and test on SWIMSEG, SWINSEG | ||||
U2PL | 0.840 ± 0.011 ○ | 0.771 ± 0.011 * | 0.127 ± 0.009 * | 0.669 ± 0.012 * |
BHPC | 0.778 ± 0.019 * | 0.779 ± 0.017 ○ | 0.114 ± 0.010 ○ | 0.679 ± 0.018 * |
UCMT | 0.789 ± 0.018 * | 0.783 ± 0.012 ○ | 0.112 ± 0.019 ○ | 0.684 ± 0.020 ○ |
FixMatch | 0.829 ± 0.010 * | 0.755 ± 0.009 * | 0.131 ± 0.008 * | 0.648 ± 0.016 * |
UniMatch | 0.774 ± 0.023 * | 0.767 ± 0.018 * | 0.126 ± 0.018 ○ | 0.664 ± 0.019 * |
SoftMatch | 0.781 ± 0.031 * | 0.775 ± 0.028 ○ | 0.116 ± 0.022 ○ | 0.674 ± 0.033 ○ |
CorrMatch | 0.818 ± 0.026 * | 0.777 ± 0.025 ○ | 0.119 ± 0.017 ○ | 0.675 ± 0.023 * |
Ours | 0.854 ± 0.018 | 0.800 ± 0.018 | 0.105 ± 0.017 | 0.708 ± 0.019 |
Method | IoU ↑ | F_Score ↑ | Recall ↑ | Error_Rate ↓ | Dice_BG ↑ | Dice_FG ↑ |
---|---|---|---|---|---|---|
Ground-Based Domain (SWINSEG + SWIMSEG) | ||||||
Baseline (Mean-Teacher) | 78.05% | 87.53% | 85.95% | 13.28% | 75.76% | 84.67% |
+InfoNCE | 79.21% | 88.12% | 90.64% | 13.46% | 75.74% | 86.55% |
+BARA-C-E loss | 79.91% | 88.68% | 92.50% | 13.00% | 72.16% | 85.72% |
+PA-CAM | 80.23% | 88.91% | 90.18% | 12.21% | 76.24% | 86.85% |
+Pseudo-NDVI | 83.06% | 90.64% | 90.25% | 10.05% | 82.87% | 88.40% |
+TV-L1 | 83.49% | 90.90% | 92.87% | 10.01% | 82.85% | 89.11% |
Aerial-Based Domain (HRC_WHU) | ||||||
Baseline (Mean-Teacher) | 75.58% | 81.77% | 91.41% | 11.28% | 90.08% | 84.31% |
+InfoNCE | 76.86% | 85.11% | 89.05% | 10.14% | 91.21% | 85.15% |
+BARA-C-E loss | 77.10% | 92.26% | 82.55% | 9.40% | 92.04% | 85.31% |
+PA-CAM | 78.14% | 88.98% | 86.59% | 9.18% | 92.10% | 86.22% |
+Pseudo-NDVI | 78.40% | 89.34% | 86.63% | 9.14% | 92.09% | 86.27% |
+TV-L1 | 79.02% | 88.22% | 90.07% | 9.12% | 91.91% | 87.30% |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Xu, Q.; Zhang, Z.; Wang, G.; Chen, Y. Physics-Guided Multi-Representation Learning with Quadruple Consistency Constraints for Robust Cloud Detection in Multi-Platform Remote Sensing. Remote Sens. 2025, 17, 2946. https://doi.org/10.3390/rs17172946