Spatiotemporal Interactive Learning for Cloud Removal Based on Multi-Temporal SAR–Optical Images
Abstract
1. Introduction
- We propose a spatiotemporal interactive learning network for cloud removal based on the fusion of multi-temporal SAR and optical remote sensing images. The network performs “many-to-many” cloud removal, generating an entire cloud-free image sequence from a cloudy input sequence (a minimal architectural sketch follows this list).
- To address the inadequate exploitation of spatiotemporal information in multi-temporal SAR and optical images, we design a spatiotemporal information interaction module that fully exploits spatiotemporal features and achieves high-fidelity cloud removal for multispectral images. Experiments on both simulated and real data demonstrate its effectiveness.
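The sketch below illustrates the “many-to-many” idea as a hypothetical PyTorch skeleton: a generator ingests a co-registered sequence of SAR and cloudy optical frames and emits a cloud-free optical frame for every time step. The module names (SpatialEncoder, TemporalBlock, ManyToManyGenerator), the 3D-convolutional temporal block, the layer widths, and the Sentinel-1/2-like band counts are illustrative assumptions, not the authors' exact architecture.

```python
# Hypothetical many-to-many SAR-optical cloud removal skeleton (not the paper's exact network).
import torch
import torch.nn as nn

class SpatialEncoder(nn.Module):
    """Per-frame spatial feature extraction from concatenated SAR + optical bands."""
    def __init__(self, in_ch, feat_ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):  # x: (B, C, H, W)
        return self.net(x)

class TemporalBlock(nn.Module):
    """Temporal interaction across the sequence via a 3D convolution (one possible choice)."""
    def __init__(self, feat_ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(feat_ch, feat_ch, (3, 3, 3), padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, f):  # f: (B, F, T, H, W)
        return self.net(f)

class ManyToManyGenerator(nn.Module):
    def __init__(self, sar_ch=2, opt_ch=13, feat_ch=64):
        super().__init__()
        self.spatial = SpatialEncoder(sar_ch + opt_ch, feat_ch)
        self.temporal = TemporalBlock(feat_ch)
        self.decoder = nn.Conv2d(feat_ch, opt_ch, 3, padding=1)

    def forward(self, sar_seq, opt_seq):
        # sar_seq: (B, T, sar_ch, H, W), opt_seq: (B, T, opt_ch, H, W)
        b, t = opt_seq.shape[:2]
        frames = torch.cat([sar_seq, opt_seq], dim=2)            # fuse modalities per frame
        feats = [self.spatial(frames[:, i]) for i in range(t)]   # per-frame spatial features
        feats = torch.stack(feats, dim=2)                        # (B, F, T, H, W)
        feats = self.temporal(feats)                             # spatiotemporal interaction
        outs = [self.decoder(feats[:, :, i]) for i in range(t)]  # one cloud-free frame per step
        return torch.stack(outs, dim=1)                          # (B, T, opt_ch, H, W)

# Usage: a batch of 4-frame sequences with assumed Sentinel-1 (2-band) / Sentinel-2 (13-band) inputs.
if __name__ == "__main__":
    g = ManyToManyGenerator()
    sar = torch.randn(1, 4, 2, 128, 128)
    opt = torch.randn(1, 4, 13, 128, 128)
    print(g(sar, opt).shape)  # torch.Size([1, 4, 13, 128, 128])
```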
2. Related Work
2.1. SAR-Based Cloud Removal
2.2. GANs for Cloud Removal in Remote Sensing Images
3. Methodology
3.1. Overall Framework of the Proposed Cloud Removal Method
3.2. Multi-Temporal Spatiotemporal Feature Joint Extraction Module
3.3. Spatiotemporal Information Interaction Module
3.4. Spatiotemporal Discriminator Module
3.5. Loss Function
4. Dataset and Experimental Setup
4.1. Dataset
4.2. Implementation Details
4.3. Evaluation Metrics
- (a) Root Mean Square Error (RMSE)
- (b) Peak Signal-to-Noise Ratio (PSNR)
- (c) Mean Absolute Error (MAE)
- (d) Spectral Angle Mapper (SAM)
- (e) Structural Similarity Index Measure (SSIM)

Minimal reference implementations of these metrics are sketched below.
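The following NumPy sketch gives minimal, non-authoritative implementations of the five metrics, assuming co-registered reference and predicted images as float arrays in [0, 1] with shape (H, W, C). The SSIM shown is the simplified single-window (global) form rather than the standard windowed SSIM of Wang et al. (2004) presumably used in the paper.

```python
# Minimal metric sketches; image inputs assumed to be float arrays in [0, 1], shape (H, W, C).
import numpy as np

def rmse(x, y):
    return np.sqrt(np.mean((x - y) ** 2))

def mae(x, y):
    return np.mean(np.abs(x - y))

def psnr(x, y, data_range=1.0):
    mse = np.mean((x - y) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def sam(x, y, eps=1e-8):
    # Spectral angle (radians) between per-pixel spectra, averaged over all pixels.
    x = x.reshape(-1, x.shape[-1])
    y = y.reshape(-1, y.shape[-1])
    cos = np.sum(x * y, axis=1) / (np.linalg.norm(x, axis=1) * np.linalg.norm(y, axis=1) + eps)
    return np.mean(np.arccos(np.clip(cos, -1.0, 1.0)))

def ssim_global(x, y, data_range=1.0):
    # Single-window (global) SSIM; a coarse approximation of the windowed metric.
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = np.mean((x - mx) * (y - my))
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```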
4.4. Comparative Methods
5. Analysis
5.1. Convergence Analysis
5.2. Ablation Studies
6. Results
6.1. Results on Simulated Dataset
6.2. Results on Real Dataset
6.3. Results on Other Regions
7. Discussion
- (a) Deep Exploitation of Spatiotemporal Information
- (b) Optimization of SAR–Optical Image Fusion
- (c) Temporal Consistency
8. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- King, M.D.; Platnick, S.; Menzel, W.P.; Ackerman, S.A.; Hubanks, P.A. Spatial and temporal distribution of clouds observed by MODIS onboard the Terra and Aqua satellites. IEEE Trans. Geosci. Remote Sens. 2013, 51, 3826–3852.
- Dian, R.; Li, S.; Sun, B.; Guo, A. Recent advances and new guidelines on hyperspectral and multispectral image fusion. Inf. Fusion 2021, 69, 40–51.
- Hong, D.; Yokoya, N.; Chanussot, J.; Zhu, X.X. An augmented linear mixing model to address spectral variability for hyperspectral unmixing. IEEE Trans. Image Process. 2018, 28, 1923–1938.
- Zhuang, L.; Ng, M.K.; Liu, Y. Cross-track illumination correction for hyperspectral pushbroom sensor images using low-rank and sparse representations. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5502117.
- Maalouf, A.; Carré, P.; Augereau, B.; Fernandez-Maloigne, C. A bandelet-based inpainting technique for clouds removal from remotely sensed images. IEEE Trans. Geosci. Remote Sens. 2009, 47, 2363–2371.
- Li, X.; Shen, H.; Zhang, L.; Zhang, H.; Yuan, Q.; Yang, G. Recovering quantitative remote sensing products contaminated by thick clouds and shadows using multitemporal dictionary learning. IEEE Trans. Geosci. Remote Sens. 2014, 52, 7086–7098.
- Li, X.; Wang, L.; Cheng, Q.; Wu, P.; Gan, W.; Fang, L. Cloud removal in remote sensing images using nonnegative matrix factorization and error correction. ISPRS J. Photogramm. Remote Sens. 2019, 148, 103–113.
- Shen, H.; Wu, J.; Cheng, Q.; Aihemaiti, M.; Zhang, C.; Li, Z. A spatiotemporal fusion based cloud removal method for remote sensing images with land cover changes. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 862–874.
- Li, W.; Li, Y.; Chan, J.C.-W. Thick cloud removal with optical and SAR imagery via convolutional-mapping-deconvolutional network. IEEE Trans. Geosci. Remote Sens. 2019, 58, 2865–2879.
- Hoan, N.T.; Tateishi, R. Cloud removal of optical image using SAR data for ALOS applications. Experimenting on simulated ALOS data. J. Remote Sens. Soc. Jpn. 2009, 29, 410–417.
- Chen, J.; Jönsson, P.; Tamura, M.; Gu, Z.; Matsushita, B.; Eklundh, L. A simple method for reconstructing a high-quality NDVI time-series data set based on the Savitzky–Golay filter. Remote Sens. Environ. 2004, 91, 332–344.
- Liu, L.; Lei, B. Can SAR images and optical images transfer with each other? In Proceedings of the IGARSS 2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 7019–7022.
- Fuentes Reyes, M.; Auer, S.; Merkle, N.; Henry, C.; Schmitt, M. SAR-to-optical image translation based on conditional generative adversarial networks—Optimization, opportunities and limits. Remote Sens. 2019, 11, 2067.
- Mendez-Rial, R.; Calvino-Cancela, M.; Martin-Herrero, J. Anisotropic inpainting of the hypercube. IEEE Geosci. Remote Sens. Lett. 2011, 9, 214–218.
- Shen, H.; Zhang, L. A MAP-based algorithm for destriping and inpainting of remotely sensed images. IEEE Trans. Geosci. Remote Sens. 2008, 47, 1492–1502.
- Cheng, Q.; Shen, H.; Zhang, L.; Li, P. Inpainting for remotely sensed images with a multichannel nonlocal total variation model. IEEE Trans. Geosci. Remote Sens. 2013, 52, 175–187.
- Lin, C.-H.; Tsai, P.-H.; Lai, K.-H.; Chen, J.-Y. Cloud removal from multitemporal satellite images using information cloning. IEEE Trans. Geosci. Remote Sens. 2012, 51, 232–241.
- Zeng, C.; Shen, H.; Zhang, L. Recovering missing pixels for Landsat ETM+ SLC-off imagery using multi-temporal regression analysis and a regularization method. Remote Sens. Environ. 2013, 131, 182–194.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144.
- Li, Z.; Shen, H.; Cheng, Q.; Li, W.; Zhang, L. Thick cloud removal in high-resolution satellite images using stepwise radiometric adjustment and residual correction. Remote Sens. 2019, 11, 1925.
- Lorenzi, L.; Melgani, F.; Mercier, G. Missing-area reconstruction in multispectral images under a compressive sensing perspective. IEEE Trans. Geosci. Remote Sens. 2013, 51, 3998–4008.
- Li, J.; Wu, Z.C.; Hu, Z.W.; Zhang, J.Q.; Li, M.L.; Mo, L.; Molinier, M. Thin cloud removal in optical remote sensing images based on generative adversarial networks and physical model of cloud distortion. ISPRS J. Photogramm. Remote Sens. 2020, 166, 373–389.
- Enomoto, K.; Sakurada, K.; Wang, W.; Fukui, H.; Matsuoka, M.; Nakamura, R.; Kawaguchi, N. Filmy cloud removal on satellite imagery with multispectral conditional generative adversarial nets. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 48–56.
- Ng, M.K.-P.; Yuan, Q.; Yan, L.; Sun, J. An adaptive weighted tensor completion method for the recovery of remote sensing images with missing data. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3367–3381.
- Ji, T.-Y.; Yokoya, N.; Zhu, X.X.; Huang, T.-Z. Nonlocal tensor completion for multitemporal remotely sensed images’ inpainting. IEEE Trans. Geosci. Remote Sens. 2018, 56, 3047–3061.
- Lin, J.; Huang, T.-Z.; Zhao, X.-L.; Chen, Y.; Zhang, Q.; Yuan, Q. Robust thick cloud removal for multitemporal remote sensing images using coupled tensor factorization. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5406916.
- Liu, J.; Musialski, P.; Wonka, P.; Ye, J. Tensor completion for estimating missing values in visual data. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 35, 208–220.
- He, W.; Yokoya, N.; Yuan, L.; Zhao, Q. Remote sensing image reconstruction using tensor ring completion and total variation. IEEE Trans. Geosci. Remote Sens. 2019, 57, 8998–9009.
- Bamler, R. Principles of synthetic aperture radar. Surv. Geophys. 2000, 21, 147–157.
- Bermudez, J.D.; Happ, P.N.; Oliveira, D.A.B.; Feitosa, R.Q. SAR to optical image synthesis for cloud removal with generative adversarial networks. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2018, 4, 5–11.
- Darbaghshahi, F.N.; Mohammadi, M.R.; Soryani, M. Cloud removal in remote sensing images using generative adversarial networks and SAR-to-optical image translation. IEEE Trans. Geosci. Remote Sens. 2021, 60, 4105309.
- Xu, F.; Shi, Y.; Ebel, P.; Yu, L.; Xia, G.-S.; Yang, W.; Zhu, X.X. GLF-CR: SAR-enhanced cloud removal with global–local fusion. ISPRS J. Photogramm. Remote Sens. 2022, 192, 268–278.
- Huang, B.; Li, Y.; Han, X.; Cui, Y.; Li, W.; Li, R. Cloud removal from optical satellite imagery with SAR imagery using sparse representation. IEEE Geosci. Remote Sens. Lett. 2015, 12, 1046–1050.
- Grohnfeldt, C.; Schmitt, M.; Zhu, X. A conditional generative adversarial network to fuse SAR and multispectral optical data for cloud removal from Sentinel-2 images. In Proceedings of the IGARSS 2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 1726–1729.
- Meraner, A.; Ebel, P.; Zhu, X.X.; Schmitt, M. Cloud removal in Sentinel-2 imagery using a deep residual neural network and SAR-optical data fusion. ISPRS J. Photogramm. Remote Sens. 2020, 166, 333–346.
- He, W.; Yokoya, N. Multi-temporal Sentinel-1 and -2 data fusion for optical image simulation. ISPRS Int. J. Geo-Inf. 2018, 7, 389.
- Bermudez, J.D.; Happ, P.N.; Feitosa, R.Q.; Oliveira, D.A. Synthesis of multispectral optical images from SAR/optical multitemporal data using conditional generative adversarial networks. IEEE Geosci. Remote Sens. Lett. 2019, 16, 1220–1224.
- Xia, Y.; Zhang, H.; Zhang, L.; Fan, Z. Cloud removal of optical remote sensing imagery with multitemporal SAR-optical data using X-Mtgan. In Proceedings of the IGARSS 2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 3396–3399.
- Gao, J.; Yuan, Q.; Li, J.; Zhang, H.; Su, X. Cloud removal with fusion of high resolution optical and SAR images using generative adversarial networks. Remote Sens. 2020, 12, 191.
- Isola, P.; Zhu, J.-Y.; Zhou, T.; Efros, A.A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1125–1134.
- Sarukkai, V.; Jain, A.; Uzkent, B.; Ermon, S. Cloud removal from satellite images using spatiotemporal generator networks. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Snowmass Village, CO, USA, 1–5 March 2020; pp. 1796–1805.
- Huang, G.-L.; Wu, P.-Y. CTGAN: Cloud transformer generative adversarial network. In Proceedings of the 2022 IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 16–19 October 2022; pp. 511–515.
- Ma, X.; Huang, Y.; Zhang, X.; Pun, M.-O.; Huang, B. Cloud-EGAN: Rethinking CycleGAN from a feature enhancement perspective for cloud removal by combining CNN and transformer. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 4999–5012.
- Chen, B.; Huang, B.; Chen, L.; Xu, B. Spatially and temporally weighted regression: A novel method to produce continuous cloud-free Landsat imagery. IEEE Trans. Geosci. Remote Sens. 2016, 55, 27–37.
- Wen, X.; Pan, Z.; Hu, Y.; Liu, J. An effective network integrating residual learning and channel attention mechanism for thin cloud removal. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6507605.
- Ji, S.; Xu, W.; Yang, M.; Yu, K. 3D convolutional neural networks for human action recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 35, 221–231.
- Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A. Inception-v4, Inception-ResNet and the impact of residual connections on learning. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017.
- Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. In Proceedings of the NIPS 2014 Workshop on Deep Learning, Montreal, QC, Canada, 13 December 2014.
- Miyato, T.; Kataoka, T.; Koyama, M.; Yoshida, Y. Spectral normalization for generative adversarial networks. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018.
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
- Ebel, P.; Xu, Y.; Schmitt, M.; Zhu, X.X. SEN12MS-CR-TS: A remote-sensing data set for multimodal multitemporal cloud removal. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–14.
- Veci, L.; Prats-Iraola, P.; Scheiber, R.; Collard, F.; Fomferra, N.; Engdahl, M. The Sentinel-1 Toolbox. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Quebec City, QC, Canada, 13–18 July 2014; pp. 1–3.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015.
- Wang, Z.; Liu, Q.; Meng, X.; Jin, W. Multidiscriminator supervision-based dual-stream interactive network for high-fidelity cloud removal on multitemporal SAR and optical images. IEEE Geosci. Remote Sens. Lett. 2023, 20, 6012205.
Method | RMSE ↓ | MAE ↓ | PSNR ↑ | SAM ↓ | SSIM ↑ |
---|---|---|---|---|---|
w/o SFE | 0.0454 | 0.0155 | 33.2293 | 0.0432 | 0.9671 |
w/o TFE | 0.0426 | 0.0196 | 33.5500 | 0.0426 | 0.9593 |
w/o SFF | 0.0359 | 0.0225 | 34.1668 | 0.0458 | 0.9549 |
Proposed | 0.0261 | 0.0123 | 34.2772 | 0.0421 | 0.9740 |

Method | RMSE ↓ | MAE ↓ | PSNR ↑ | SAM ↓ | SSIM ↑ |
---|---|---|---|---|---|
w/o SFE | 0.0682 | 0.0336 | 24.4828 | 0.0762 | 0.9005 |
w/o TFE | 0.0697 | 0.0426 | 23.8716 | 0.0766 | 0.8922 |
w/o SFF | 0.0652 | 0.0335 | 24.4379 | 0.0783 | 0.8950 |
Proposed | 0.0625 | 0.0328 | 24.8952 | 0.0755 | 0.9155 |

Method | AvgRMSE ↓ | AvgMAE ↓ | AvgPSNR ↑ | AvgSAM ↓ | AvgSSIM ↑ |
---|---|---|---|---|---|
w/o SFE | 0.0172 | 0.0115 | 36.3614 | 0.0493 | 0.9529 |
w/o TFE | 0.0172 | 0.0116 | 36.4248 | 0.0462 | 0.9553 |
w/o SFF | 0.0177 | 0.0115 | 36.1311 | 0.0446 | 0.9562 |
Proposed | 0.0165 | 0.0107 | 37.0414 | 0.0446 | 0.9566 |

Method | RMSE ↓ | MAE ↓ | PSNR ↑ | SAM ↓ | SSIM ↑ |
---|---|---|---|---|---|
SAR2OPT | 0.0652 | 0.0367 | 27.4452 | 0.1640 | 0.6847 |
MDS-DIN | 0.0671 | 0.0275 | 27.5020 | 0.1496 | 0.8358 |
CR-TS Net | 0.0467 | 0.0196 | 32.0026 | 0.0425 | 0.9482 |
Proposed | 0.0261 | 0.0123 | 34.2772 | 0.0421 | 0.9740 |

Method | RMSE ↓ | MAE ↓ | PSNR ↑ | SAM ↓ | SSIM ↑ |
---|---|---|---|---|---|
SAR2OPT | 0.1095 | 0.0676 | 19.2126 | 0.1936 | 0.6185 |
MDS-DIN | 0.1223 | 0.0580 | 18.2531 | 0.2389 | 0.6816 |
CR-TS Net | 0.0783 | 0.0396 | 23.4417 | 0.0752 | 0.8989 |
Proposed | 0.0625 | 0.0328 | 24.8952 | 0.0755 | 0.9155 |

Method | RMSE ↓ | MAE ↓ | PSNR ↑ | SAM ↓ | SSIM ↑ |
---|---|---|---|---|---|
SAR2OPT | 0.0349 | 0.0266 | 29.7669 | 0.0816 | 0.9174 |
MDS-DIN | 0.0461 | 0.0365 | 27.8260 | 0.0912 | 0.9124 |
CR-TS Net | 0.0183 | 0.0131 | 35.8360 | 0.0382 | 0.9673 |
Proposed | 0.0166 | 0.0118 | 36.7376 | 0.0367 | 0.9672 |

Method | RMSE ↓ | MAE ↓ | PSNR ↑ | SAM ↓ | SSIM ↑ |
---|---|---|---|---|---|
SAR2OPT | 0.0568 | 0.0402 | 25.2901 | 0.1649 | 0.7651 |
MDS-DIN | 0.0448 | 0.0331 | 27.2584 | 0.1409 | 0.8315 |
CR-TS Net | 0.0348 | 0.0239 | 29.7228 | 0.1018 | 0.8771 |
Proposed | 0.0326 | 0.0221 | 30.3208 | 0.0976 | 0.8823 |

Method | AvgRMSE ↓ | AvgMAE ↓ | AvgPSNR ↑ | AvgSAM ↓ | AvgSSIM ↑ |
---|---|---|---|---|---|
SAR2OPT | 0.0392 | 0.0290 | 28.7630 | 0.1041 | 0.8887 |
MDS-DIN | 0.0413 | 0.0311 | 29.0451 | 0.0985 | 0.9045 |
CR-TS Net | 0.0179 | 0.0112 | 36.2380 | 0.0482 | 0.9548 |
Proposed | 0.0165 | 0.0107 | 37.0414 | 0.0446 | 0.9566 |

Method | Training Time | Testing Time | Parameters |
---|---|---|---|
SAR2OPT | 58 h 7 min | 7 min 2 s | 54.462 M |
MDS-DIN | 58 h 29 min | 8 min 18 s | 54.462 M |
CR-TS Net | 76 h 6 min | 8 min 2 s | 38.474 M |
Proposed | 82 h 9 min | 9 min 4 s | 38.619 M |

Method | RMSE ↓ | MAE ↓ | PSNR ↑ | SAM ↓ | SSIM ↑ |
---|---|---|---|---|---|
SAR2OPT | 0.2292 | 0.2162 | 12.3045 | 1.8789 | 0.3392 |
MDS-DIN | 0.0529 | 0.0398 | 27.1005 | 0.2019 | 0.7857 |
CR-TS Net | 0.0431 | 0.0333 | 27.0384 | 0.1862 | 0.7739 |
Proposed | 0.0404 | 0.0314 | 28.1758 | 0.1766 | 0.7767 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Xu, C.; Wang, Z.; Chen, L.; Meng, X. Spatiotemporal Interactive Learning for Cloud Removal Based on Multi-Temporal SAR–Optical Images. Remote Sens. 2025, 17, 2169. https://doi.org/10.3390/rs17132169