A Convolution and Attention Neural Network with MDTW Loss for Cross-Variable Reconstruction of Remote Sensing Image Series
Abstract
1. Introduction
- A novel convolution and attention feature extractor (CAFE) is designed to obtain convolutional and attentive features and to highlight the underlying key features that contribute most to regression and reconstruction;
- A CAFE-based deep neural network, the Gated Recurrent Unit (GRU) model stack (CAFE-GRUS), is proposed for cross-variable environmental image reconstruction; it captures complex spatiotemporal correlations across the spatial, temporal, and variable dimensions;
- A novel loss function, multivariate dynamic time warping (MDTW), is designed for reconstructing multi-step image series of target variables via supervised learning from the source variable;
- The proposed architecture achieves cross-variable reconstruction on real-world remote sensing image series, and the experimental results validate its superior performance.
- The remainder of this article is organized as follows. Section 2 reviews the related work. Section 3 provides the preliminaries and study materials. Section 4 presents the developed deep neural network, detailing the proposed feature extractor, network architecture, and loss function. Section 5 demonstrates the experiments, Section 6 discusses the findings drawn from the experimental results, and the last section concludes this work.
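The MDTW loss builds on classic dynamic time warping, which aligns two series by a dynamic program over a cumulative-cost matrix. Below is a minimal sketch of the hard (non-differentiable) univariate DTW distance, for orientation only; the function name `dtw_distance` is illustrative, not from the paper:

```python
import numpy as np

def dtw_distance(x, y):
    """Classic dynamic-time-warping distance between two 1-D series.

    D[i, j] holds the cost of the best alignment of x[:i] with y[:j],
    using the absolute difference as the local cost.
    """
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # A warping step may extend a match, an insertion, or a deletion.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Unlike a pointwise error, DTW tolerates small temporal shifts: `[0, 0, 1]` and `[0, 1, 1]` align at zero cost. To serve as a training loss, the hard `min` must be replaced by a smoothed soft-minimum, as in the Soft-DTW formulation of Cuturi and Blondel cited in the references.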
2. Related Work
2.1. Image Series Reconstruction of Environmental Variables
2.2. Dynamic Time Warping
3. Materials
3.1. Preliminaries
3.2. Study Area and Data Material
4. Methods
4.1. Convolution and Attention Feature Extractor
4.2. CAFE-Based Recurrent Neural Network
4.3. MDTW Loss Function
5. Results
5.1. Experimental Setup
- ResNet: A residual neural network (ResNet) that learns residual functions in its weight layers with reference to the layer inputs;
- ConvLSTM: Combines convolutional layers with a long short-term memory network to extract features and capture both spatial and temporal dependencies in image series;
- ConvGRU: Combines convolutional layers with a GRU network to extract features and capture both spatial and temporal dependencies in image series;
- CSA-ConvLSTM: A convolutional self-attention (CSA) mechanism integrated into a ConvLSTM network for forecasting environmental image series.
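As a rough illustration of the ConvGRU baseline: it is a GRU cell whose gate transformations are convolutions rather than dense products, so the hidden state keeps its spatial layout. The following single-channel NumPy sketch assumes 3×3 kernels with zero "same" padding; `conv2d_same` and `convgru_step` are hypothetical names, not taken from any of the compared implementations:

```python
import numpy as np

def conv2d_same(x, k):
    """Single-channel 2-D sliding-window product (cross-correlation, as in
    deep-learning frameworks) with zero 'same' padding."""
    kh, kw = k.shape
    p = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.empty_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(p[i:i + kh, j:j + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convgru_step(x, h, kernels):
    """One ConvGRU update: the standard GRU equations with every dense
    product replaced by a convolution. `kernels` holds the six 3x3 filters
    (input and hidden filters for the update gate, reset gate, candidate)."""
    kz_x, kz_h, kr_x, kr_h, kh_x, kh_h = kernels
    z = sigmoid(conv2d_same(x, kz_x) + conv2d_same(h, kz_h))       # update gate
    r = sigmoid(conv2d_same(x, kr_x) + conv2d_same(h, kr_h))       # reset gate
    h_tilde = np.tanh(conv2d_same(x, kh_x) + conv2d_same(r * h, kh_h))
    return (1.0 - z) * h + z * h_tilde                              # new hidden state
```

Because every operation preserves the image shape, the same cell can be unrolled over a frame sequence to propagate a spatial hidden state through time.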
5.2. Experimental Results
6. Discussion
- ConvGRU: The basic backbone of convolutional and GRU layers, trained with a basic mean absolute error loss;
- ConvGRU with MDTW: The convolutional and GRU backbone trained with the proposed MDTW loss;
- CAFE-GRUS: The proposed CAFE-GRUS model trained with a basic mean absolute error loss;
- CAFE-GRUS with MDTW: The full version of the proposed model, CAFE-GRUS trained with the proposed MDTW loss.
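The ablations above swap mean absolute error for the MDTW loss. A minimal sketch of the hard multivariate (dependent) DTW distance that underlies such a loss treats each time step as a vector, e.g. a flattened image frame, compared with the Euclidean norm before the usual dynamic program. `mdtw_distance` is an illustrative name, and the paper's actual loss would additionally need a differentiable soft-minimum to be trainable:

```python
import numpy as np

def mdtw_distance(X, Y):
    """Dependent multivariate DTW distance between two sequences of
    vectors (shape: time x features). All features warp together along
    one shared alignment path, scored by the per-step Euclidean norm."""
    n, m = len(X), len(Y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(X[i - 1] - Y[j - 1])  # frame-to-frame distance
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Using one shared path for all features (the "dependent" variant, per Shokoohi-Yekta et al. in the references) is the natural choice for image series, since all pixels of a frame share a single timestamp.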
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Yuan, Q.; Shen, H.; Li, T.; Li, Z.; Li, S.; Jiang, Y.; Xu, H.; Tan, W.; Yang, Q.; Wang, J.; et al. Deep learning in environmental remote sensing: Achievements and challenges. Remote Sens. Environ. 2020, 241, 111716.
- Zeng, C.; Shen, H.; Zhong, M.; Zhang, L.; Wu, P. Reconstructing MODIS LST based on multitemporal classification and robust regression. IEEE Geosci. Remote Sens. Lett. 2014, 12, 512–516.
- Zeng, C.; Long, D.; Shen, H.; Wu, P.; Cui, Y.; Hong, Y. A two-step framework for reconstructing remotely sensed land surface temperatures contaminated by cloud. ISPRS J. Photogramm. Remote Sens. 2018, 141, 30–45.
- Hong, F.; Zhan, W.; Göttsche, F.M.; Liu, Z.; Zhou, J.; Huang, F.; Lai, J.; Li, M. Comprehensive assessment of four-parameter diurnal land surface temperature cycle models under clear-sky. ISPRS J. Photogramm. Remote Sens. 2018, 142, 190–204.
- Li, J.; Yu, Z.; Yu, L.; Cheng, P.; Chen, J.; Chi, C. A comprehensive survey on SAR ATR in deep-learning era. Remote Sens. 2023, 15, 1454.
- Zhang, L.; Zhang, L.; Du, B. Deep learning for remote sensing data: A technical tutorial on the state of the art. IEEE Geosci. Remote Sens. Mag. 2016, 4, 22–40.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
- Shi, X.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Adv. Neural Inf. Process. Syst. 2015, 28, 1–9.
- Shi, X.; Gao, Z.; Lausen, L.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Deep learning for precipitation nowcasting: A benchmark and a new model. Adv. Neural Inf. Process. Syst. 2017, 30, 1–11.
- Malek, S.; Melgani, F.; Bazi, Y.; Alajlan, N. Reconstructing cloud-contaminated multispectral images with contextualized autoencoder neural networks. IEEE Trans. Geosci. Remote Sens. 2017, 56, 2270–2282.
- Zhang, Q.; Yuan, Q.; Zeng, C.; Li, X.; Wei, Y. Missing data reconstruction in remote sensing image with a unified spatial–temporal–spectral deep convolutional neural network. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4274–4288.
- Li, T.; Gu, Y. Progressive spatial–spectral joint network for hyperspectral image reconstruction. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–4.
- Zhang, W.; Li, J.; Hua, Z. Attention-based tri-UNet for remote sensing image pan-sharpening. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 3719–3732.
- Ebel, P.; Xu, Y.; Schmitt, M.; Zhu, X.X. SEN12MS-CR-TS: A remote-sensing data set for multimodal multitemporal cloud removal. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–4.
- Bellman, R.; Kalaba, R. On adaptive control processes. IRE Trans. Autom. Control 1959, 4, 1–9.
- Fan, L.; Yang, J.; Sun, X.; Zhao, F.; Liang, S.; Duan, D.; Chen, H.; Xia, L.; Sun, J.; Yang, P. The effects of Landsat image acquisition date on winter wheat classification in the North China Plain. ISPRS J. Photogramm. Remote Sens. 2022, 187, 1–13.
- Lu, F.; Chen, B.; Zhou, X.D.; Song, D. STA-VPR: Spatio-temporal alignment for visual place recognition. IEEE Robot. Autom. Lett. 2021, 6, 4297–4304.
- Cai, X.; Xu, T.; Yi, J.; Huang, J.; Rajasekaran, S. DTWNet: A dynamic time warping network. Adv. Neural Inf. Process. Syst. 2019, 32, 2–11.
- Cuturi, M.; Blondel, M. Soft-DTW: A differentiable loss function for time-series. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 894–903.
- Rath, T.M.; Manmatha, R. Lower-bounding of dynamic time warping distances for multivariate time series. Univ. Mass. Amherst Tech. Rep. MM 2002, 40, 1–4.
- Shokoohi-Yekta, M.; Hu, B.; Jin, H.; Wang, J.; Keogh, E. Generalizing DTW to the multi-dimensional case requires an adaptive approach. Data Min. Knowl. Discov. 2017, 31, 1–31.
- Zhang, X.; Gao, Y.; Lin, J.; Lu, C.T. TapNet: Multivariate time series classification with attentional prototypical network. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 6845–6852.
- Shen, D.S.; Chi, M. TC-DTW: Accelerating multivariate dynamic time warping through triangle inequality and point clustering. Inf. Sci. 2023, 621, 611–626.
- Li, H.; Wan, J.; Liu, S.; Sheng, H.; Xu, M. Wetland vegetation classification through multi-dimensional feature time series remote sensing images using Mahalanobis distance-based dynamic time warping. Remote Sens. 2022, 14, 501.
- The Regional NCOM AMSEAS 2D Dataset. Available online: https://www.ncei.noaa.gov/erddap/griddap/NCOM_amseas_latest2d.html (accessed on 22 May 2023).
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
Dataset | Year Range | Sample Number | Percentage |
---|---|---|---|
Training | 2015–2019 | 14,202 | 72% |
Validation | 2020–2021 | 2643 | 13% |
Test | 2021–2022 | 2915 | 15% |
Compared Model | STF to SSF (10⁻⁶ psu·m/s) | SSF to STF (10⁻⁶ °C·m/s) | STF to SAP (Pa) | SAP to STF (10⁻⁶ °C·m/s) | SSF to SAP (Pa) | SAP to SSF (10⁻⁶ psu·m/s) |
---|---|---|---|---|---|---|
ConvLSTM | 10.65 ± 4.03 | 12.93 ± 6.11 | 103 ± 26 | 13.34 ± 6.99 | 121 ± 32 | 13.94 ± 7.97 |
ConvGRU | 8.85 ± 2.91 | 10.32 ± 4.82 | 94 ± 23 | 12.93 ± 5.28 | 98 ± 25 | 12.39 ± 7.38 |
ResNet | 6.07 ± 2.08 | 7.99 ± 3.59 | 78 ± 13 | 10.32 ± 4.82 | 89 ± 12 | 11.42 ± 6.94 |
CSAResNet | 5.68 ± 1.94 | 7.93 ± 2.73 | 73 ± 12 | 8.93 ± 4.32 | 87 ± 13 | 9.22 ± 5.20 |
CAFE-GRUS | 3.43 ± 0.91 | 5.57 ± 1.35 | 55 ± 6 | 6.20 ± 3.97 | 68 ± 8 | 7.93 ± 4.82 |
CAFE-GRUS | STF to SSF (10⁻⁶ psu·m/s) | SSF to STF (10⁻⁶ °C·m/s) | STF to SAP (Pa) | SAP to STF (10⁻⁶ °C·m/s) | SSF to SAP (Pa) | SAP to SSF (10⁻⁶ psu·m/s) |
---|---|---|---|---|---|---|
Time Step 1 | 4.11 ± 0.98 | 5.63 ± 1.55 | 59 ± 9 | 6.24 ± 4.32 | 69 ± 10 | 8.03 ± 5.12 |
Time Step 2 | 3.45 ± 0.92 | 5.58 ± 1.43 | 59 ± 7 | 6.22 ± 3.98 | 69 ± 9 | 7.97 ± 4.96 |
Time Step 3 | 3.44 ± 0.92 | 5.59 ± 1.38 | 55 ± 7 | 6.21 ± 3.95 | 66 ± 7 | 7.99 ± 4.81 |
Time Step 4 | 3.40 ± 0.91 | 5.57 ± 1.38 | 54 ± 6 | 6.18 ± 3.95 | 64 ± 8 | 7.93 ± 4.82 |
Time Step 5 | 3.37 ± 0.88 | 5.50 ± 1.32 | 52 ± 5 | 6.12 ± 3.90 | 62 ± 7 | 7.91 ± 4.71 |
Compared Model | STF to SSF (10⁻⁶ psu·m/s) | SSF to STF (10⁻⁶ °C·m/s) | STF to SAP (Pa) | SAP to STF (10⁻⁶ °C·m/s) | SSF to SAP (Pa) | SAP to SSF (10⁻⁶ psu·m/s) |
---|---|---|---|---|---|---|
ConvGRU | 8.85 ± 2.91 | 10.32 ± 4.82 | 94 ± 23 | 12.93 ± 5.28 | 98 ± 25 | 12.39 ± 7.38 |
ConvGRU with MDTW | 6.92 ± 2.19 | 85 ± 17 | 11.42 ± 3.97 | 99 ± 21 | 10.43 ± 5.85 | 11.86 ± 7.27 |
CAFE-GRUS | 5.43 ± 1.94 | 78 ± 14 | 9.72 ± 3.83 | 96 ± 19 | 9.32 ± 5.28 | 10.53 ± 6.11 |
CAFE-GRUS with MDTW | 3.43 ± 0.91 | 5.57 ± 1.35 | 55 ± 6 | 6.20 ± 3.97 | 68 ± 8 | 7.93 ± 4.82 |
Share and Cite
Li, C.; Wang, H.; Su, Q.; Ning, C.; Li, T. A Convolution and Attention Neural Network with MDTW Loss for Cross-Variable Reconstruction of Remote Sensing Image Series. Remote Sens. 2023, 15, 3552. https://doi.org/10.3390/rs15143552