# Missing Pixel Reconstruction on Landsat 8 Analysis Ready Data Land Surface Temperature Image Patches Using Source-Augmented Partial Convolution


## Abstract


## 1. Introduction

Existing reconstruction methods fall into four categories: (i) spatial-based methods that use the remaining spatial information within the image itself (e.g., l^{1}–l^{2} norm regularization [8,9] and exemplar-based inpainting [10]); (ii) spectral-based methods that utilize the redundant spectral information in multispectral and hyperspectral images (e.g., [9]); (iii) temporal-based methods that use the temporal dependency information among images (e.g., temporal replacement [11], temporal filter [12], and temporal learning model [13]); and (iv) spatial-temporal-spectral-based methods (e.g., spatial completion with temporal guidance [14] and joint spectral-temporal methods [15]). In addition, Sun et al. (2018) [16] proposed a special spectral method, graph Fourier transform of matrix network, to recover the entire matrix network from incomplete observations.

## 2. Materials and Methods

#### 2.1. Model Architecture

#### 2.1.1. Overview

#### 2.1.2. Encoder

#### 2.1.3. Decoder

#### 2.1.4. Partial Convolution Layer

In the PConv layer, **X** and **M** denote the input image and mask slices in the sliding window, and **W**_{j} and b_{j} are the convolution weights and bias for the jth output feature. x_{j}′ and m_{j}′ are the PConv's jth output value and the corresponding updated mask value. Figure 4 gives the flowchart of the PConv layer in SAPC2.

#### 2.1.5. Partial Merge Layer and Merge Layer

The PMerge layer combines the target and source features, weighting the source contribution at the masked locations of **M** with the merge weight ${t}_{M}$ in Equation (4); ${t}_{M}$ is defined by Equations (5) and (6) as a function of the mask **M**. Other notations in Equations (4)–(6) are the same as in PConv. Figure 5 presents the flowchart of the PMerge layer.

#### 2.1.6. Linear Convolution Layer and Final Mean and Standard Deviation Adjustment Layer

#### 2.2. Baseline Models

#### 2.2.1. SAPC1

- (a) the initial skip connection for the last decoder is the source image instead of the PMerge of the source and target;
- (b) the first encoder does not have batch normalization after PConv;
- (c) the first four encoders' PConvs use ReLU as activations;
- (d) the last (5th) encoder is implemented as a PConv on the previous encoder's target output image, and there is no batch normalization or activation before passing its output to the first decoder;
- (e) the first four decoders' (i.e., Decoders 4, 3, 2, and 1 in Figure 1) activations are LeakyReLU;
- (f) the last decoder has neither batch normalization nor activation; and
- (g) no unmasked mean square error (MSE) losses are imposed between the encoder–decoder pairs (e.g., Encoder 1's target input vs. Decoder 1's output) to encourage the matching levels to have similar abstract features.

#### 2.2.2. SAPC2-Original Partial Convolution (SAPC2-OPC)

#### 2.2.3. SAPC2-Standard Convolution (SAPC2-SC)

#### 2.2.4. STS-CNN

- (a) There is almost no change in the number of features (i.e., 60) across the STS-CNN framework except in the first and last convolution steps, and the spatial size is constant for the entire network. In contrast, SAPC2 is based on the U-Net framework with intended feature and spatial size variations.
- (b) STS-CNN utilizes multiscale convolution to extract more features for multi-context information and dilated convolution to enlarge the receptive field while maintaining the minimal kernel size.
- (c) STS-CNN uses standard convolution instead of partial convolution in the framework.
- (d) STS-CNN uses ReLU for all activations.
- (e) STS-CNN takes the satellite images as inputs but does not utilize the date-related inputs used in SAPC2 (i.e., day of year and date difference from the target acquisition date). To take into account the seasonal LST variations, each sample's source and target images are independently scaled to the same range [0, 1].
- (f) STS-CNN aims for fast convergence and chooses to optimize the MSE of the residual map, which is close to (but not the same as) the masked MSE used in SAPC2. As a result, many metric/loss functions for SAPC2 that depend on the unmasked part or on the feature domains are not provided for STS-CNN.

#### 2.3. Training Procedure

#### 2.3.1. Number of Trainable Variables

#### 2.3.2. Loss Functions

In the loss definitions, **I**_{out} is the raw model prediction (i.e., before replacing the unmasked part with the target image), **I**_{gt} is the ground truth (i.e., the complete target image), and **M** is the mask in which value 1 indicates unmasked (good) pixels and value 0 indicates masked pixels. CC() is the correlation coefficient function, and Sobel() is the function that returns the two (i.e., horizontal and vertical) Sobel edge maps.

The total loss also includes recovery loss terms (L_{recovery}) that measure the MSEs between the encoders' target inputs and the corresponding decoders' outputs in the unmasked regions, to encourage the recovery of similar features at the matching encoder–decoder levels. These loss terms do not include the Encoder 1–Decoder 1 level. Finally, the total loss includes an l2 weight regularization term, L_{reg,l2}.
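The masked/unmasked MSE terms and the CC() function above can be illustrated with a small NumPy sketch (function names are hypothetical; the Sobel-edge and correlation-weighting details of L_{sobel,masked} are omitted for brevity):

```python
import numpy as np

def masked_mse_terms(I_out, I_gt, M):
    """Masked and unmasked MSE between prediction and ground truth.
    In M, 1 marks unmasked (good) pixels and 0 marks masked pixels."""
    masked, unmasked = (M == 0), (M == 1)
    mse_masked = float(np.mean((I_out[masked] - I_gt[masked]) ** 2))
    mse_unmasked = float(np.mean((I_out[unmasked] - I_gt[unmasked]) ** 2))
    return mse_masked, mse_unmasked

def cc(a, b):
    """CC(): Pearson correlation coefficient between two images."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])
```

The weighted total loss is then a weighted sum of such terms, with the weights of Equation (13).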

#### 2.3.3. Learning Rate Scheduling

The learning rate schedule combines a per-epoch exponential decay with a triangular cyclical variation within each epoch [34]. The decay is specified by the number of epochs (n_{ep}) between the initial minimum learning rate (lr_{min,init}) and the final minimum learning rate (lr_{min,nep}) reached at epoch n_{ep}. With these two parameters, and assuming a constant decay rate between any two adjacent epochs, the decay rate per epoch (d_{ep}) is expressed by:

$$ d_{ep} = \left( \frac{lr_{min,nep}}{lr_{min,init}} \right)^{1/n_{ep}} $$

The maximum (lr_{max,iep}) and minimum (lr_{min,iep}) learning rates and their range (lr_{range,iep}) at the ith epoch (iep) are expressed by:

$$ lr_{min,iep} = lr_{min,init} \cdot d_{ep}^{\,iep}, \qquad lr_{max,iep} = lr_{max,init} \cdot d_{ep}^{\,iep}, \qquad lr_{range,iep} = lr_{max,iep} - lr_{min,iep} $$

where lr_{max,init} is the initial maximum learning rate. For the DNN in this study, it was found that an lr_{max,init} three times larger than lr_{min,init} showed the best model performance.

With lr_{min,iep} and lr_{range,iep}, the remaining steps to derive the per-step learning rate are the same as in other cyclical scheduling methods [34]: for a step p within epoch iep, the learning rate lr_{p,iep} rises linearly from lr_{min,iep} to lr_{min,iep} + lr_{range,iep} and falls back within each cycle.
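Under the stated assumptions (constant per-epoch decay rate, triangular variation within each epoch), the schedule can be sketched as follows (the exponential decay form is inferred from the description; function names and one cycle per epoch are assumptions):

```python
def lr_bounds(iep, lr_min_init, lr_max_init, lr_min_nep, n_ep):
    """Per-epoch learning-rate bounds: both bounds decay at a constant
    per-epoch rate d_ep chosen so the minimum rate equals lr_min_nep
    at epoch n_ep."""
    d_ep = (lr_min_nep / lr_min_init) ** (1.0 / n_ep)
    lr_min = lr_min_init * d_ep ** iep
    lr_max = lr_max_init * d_ep ** iep
    return lr_min, lr_max, lr_max - lr_min

def lr_at_step(p, steps_per_epoch, lr_min, lr_range):
    """Triangular cyclical rate within one epoch: rises linearly from
    lr_min to lr_min + lr_range at mid-epoch, then falls back."""
    frac = p / steps_per_epoch            # position within the epoch, [0, 1)
    tri = 1.0 - abs(2.0 * frac - 1.0)     # 0 -> 1 -> 0 triangle
    return lr_min + lr_range * tri
```

With the values reported in Section 2.4.1 (lr_{max,init} = 7.05 × 10^{−4} = 3 × lr_{min,init}), `lr_bounds(0, 2.35e-4, 7.05e-4, ...)` returns the initial bounds.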

#### 2.4. Implementation

#### 2.4.1. Software and Hardware

The learning rate schedule used lr_{max,init} = 7.05 × 10^{−4} and lr_{min,init} = 2.35 × 10^{−4} for all five models.

#### 2.4.2. Hyperparameter Tuning

The hyperparameters, including the learning rate bounds (lr_{min,init} and lr_{max,init}) and the weights for the six loss terms in Equation (13), are tuned using grid search on approximately 21% of the full training dataset. Since the weights for the total loss may not be available at the time of hyperparameter tuning, the objective function for tuning is the average of the masked and unmasked MSE.

#### 2.5. Dataset

## 3. Results

#### 3.1. Training Process

#### 3.2. Validation Statistics

- MSE_{masked}: the Mean Square Error (MSE) between the model prediction and ground truth in the masked part
- L_{sobel,masked}: the source-target-correlation-coefficient-weighted MSE between the Sobel-edge-transformed model prediction and the corresponding transformed ground truth in the masked part
- MSE_{unmasked}: the MSE between the model prediction and ground truth in the unmasked part
- MSE_{sobel,unmasked}: the MSE between the Sobel-edge-transformed model prediction and the corresponding transformed ground truth in the unmasked part
- MSE_{weighted}: the weighted sum of MSE_{masked}, L_{sobel,masked}, MSE_{unmasked}, and MSE_{sobel,unmasked}, with the weights given in Equation (13)
- MSE_{mosaic}: the MSE between the mosaicked model prediction and ground truth
- CC_{mosaic}: the Correlation Coefficient between the mosaicked model prediction and ground truth
- PSNR_{mosaic}: the Peak Signal-to-Noise Ratio of the mosaicked model prediction
- SSIM_{mosaic}: the Structural Similarity Index of the mosaicked model prediction
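The mosaicked prediction and the PSNR metric above can be sketched as follows (hypothetical helpers; mosaicking as mask-weighted blending of the observed target and the prediction, and a peak value of 1.0 for [0, 1]-scaled data, are assumptions consistent with the metric definitions):

```python
import numpy as np

def mosaic(I_out, I_target, M):
    """Mosaicked prediction: keep the observed (unmasked, M == 1)
    target pixels and fill the masked holes with the prediction."""
    return M * I_target + (1 - M) * I_out

def psnr(pred, gt, peak=1.0):
    """Peak signal-to-noise ratio in dB; peak=1.0 assumes [0,1] data."""
    mse = np.mean((pred - gt) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

MSE_{mosaic}, CC_{mosaic}, and SSIM_{mosaic} are then computed on `mosaic(...)` against the complete ground truth.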

As shown in Table 1, SAPC2 has the best mean values on all nine metrics (including CC_{mosaic}, PSNR_{mosaic}, and SSIM_{mosaic}). SAPC2 also has the lowest standard deviation values on all metrics except for PSNR_{mosaic}, on which STS-CNN has the lowest value (4.33).

Figure 8 compares the probability density functions (PDFs) of MSE_{masked}. The SAPC2 PDFs are displayed as the blue shapes in the four subplots, while the orange shapes represent the four baselines' PDFs. STS-CNN has the PDF most distinguishable from SAPC2's (Figure 8a): SAPC2's left end is closer to zero, its peak is higher and located at a smaller MSE value, and its right tail is much shorter than STS-CNN's. Although similar patterns are also seen for SAPC2-SC (Figure 8b) and SAPC2-OPC (Figure 8c), their PDFs are closer to matching SAPC2's. The PDF of SAPC1 (Figure 8d) is almost identical to that of SAPC2, but from the left end to MSE_{masked} ≈ 0.025, SAPC2 (the blue bars) has slightly higher PDF values. The PDF comparisons for all other metrics show similarly distinguishable results (data not shown).

#### 3.3. Case Study

For Case 1 (Figure 9), SAPC2 has the lowest MSE_{masked} and L_{sobel,masked} values among the five models (i.e., 0.00183 and 0.00268). They are 67% and 46% lower than those of SAPC1 (0.00560 and 0.00494), 55% and 32% lower than those of SAPC2-OPC (0.00403 and 0.00394), 83% and 59% lower than those of SAPC2-SC (0.01070 and 0.00647), and 59% and 98% lower than those of STS-CNN (0.00449 and 0.13682). According to these metrics, SAPC2 is the best model for the case. Along the left edge of the mask, Figure 9d–g shows some significant edge artifacts: STS-CNN exhibits some random noisy pixels, and the three SAPC-based models show a smooth pattern but an abrupt color shift near the edge area. The details in the masked regions are not well recovered by any method due to the source image's quality; however, the overall color scale of SAPC2 in the masked part is closer to the ground truth than that of the other methods.

The MSE_{masked} and L_{sobel,masked} for SAPC2 are 0.00075 and 0.00267, respectively. They are 53% and 14% lower than those of SAPC1 (0.00161 and 0.00312), 44% and 32% lower than those of SAPC2-OPC (0.00134 and 0.00392), 91% and 65% lower than those of SAPC2-SC (0.00843 and 0.00761), and 67% and 98% lower than those of STS-CNN (0.00227 and 0.11243). According to these metrics, SAPC2 is the best model for Case 2.

The MSE_{masked} and L_{sobel,masked} for SAPC2 are 0.00126 and 0.00917, respectively. They are 39% and 16% lower than those of SAPC1 (0.00207 and 0.01098), 40% and 13% lower than those of SAPC2-OPC (0.00211 and 0.01057), 75% and 23% lower than those of SAPC2-SC (0.00508 and 0.01194), and 69% and 86% lower than those of STS-CNN (0.00410 and 0.06764). According to these metrics, SAPC2 is the best model for Case 3.

The MSE_{masked} and L_{sobel,masked} for SAPC2 are 0.00106 and 0.00657, respectively. They are 36% and 9% lower than those of SAPC1 (0.00166 and 0.00722), 55% and 20% lower than those of SAPC2-OPC (0.00236 and 0.00824), 80% and 30% lower than those of SAPC2-SC (0.00528 and 0.00941), and 43% and 83% lower than those of STS-CNN (0.00187 and 0.03858). According to these metrics, SAPC2 is the best model for Case 4.

The MSE_{masked} and L_{sobel,masked} for SAPC2 are 0.00277 and 0.01194, respectively. They are 68% and 49% lower than those of SAPC1 (0.00872 and 0.02335), 39% and 21% lower than those of SAPC2-OPC (0.00457 and 0.01516), 65% and 31% lower than those of SAPC2-SC (0.00787 and 0.01729), and 85% and 96% lower than those of STS-CNN (0.01866 and 0.31452). According to these metrics, SAPC2 is the best model for Case 5.

The MSE_{masked} and L_{sobel,masked} for SAPC2 are 0.00233 and 0.00536, respectively. They are 53% and 35% lower than those of SAPC1 (0.00499 and 0.00824), 40% and 32% lower than those of SAPC2-OPC (0.00386 and 0.00783), 63% and 35% lower than those of SAPC2-SC (0.00632 and 0.00821), and 59% and 92% lower than those of STS-CNN (0.00565 and 0.06605). According to these metrics, SAPC2 is the best model for Case 6.

## 4. Discussion

#### 4.1. SAPC2 Versus STS-CNN

Zero-one scaling has a large effect on STS-CNN's performance (MSE_{masked} = 0.03728 for the non-scaled version versus 0.00769 for the scaled one). For LST images, even with the standardization on the entire dataset as described in Section 2.5, because of the seasonal LST variation, the distributions of the target and the source in one sample could still be far apart even if the two images are highly correlated. The distance between the two distributions can be much smaller after zero-one scaling, reducing the burden on the model of matching the means of the distributions. This may also explain why significant artifacts are not an issue for large masks with zero-one scaling (e.g., in the cases presented in the previous section). With zero-one scaling, the STS-CNN model only needs to make fine adjustments to the source image in the masked region (evidence: STS-CNN has a deep skip connection that allows the source image information to bypass most of the network before output), so the zero values in the mask, rather than information diffused from the target, have limited influence on the model prediction.

Notably, the validation MSE_{masked} is 0.00566 for SAPC2-SC, while that of STS-CNN is 0.00769. If the reasoning above, that DOY and DOY difference improve the nonlinear matching of distributions, is correct, it suggests that adding other ancillary data that have a nonlinear influence on LST change over time as model inputs may further improve model performance.

#### 4.2. SAPC2 Versus SAPC2-OPC and SAPC2-SC

Three implementations of the correction ratio r are considered: (1) r = 1, which reduces PConv to standard convolution; (2) r = sum(**1**)/sum(**M**), the mask-only correction ratio, which is used in the original partial convolution [32]; and (3) r = sum(|**W**|·**1**)/sum(|**W**|·**M**), the absolute-weight-adjusted correction ratio, which is used in the improved partial convolution [33]. A mathematical proof of the effect of these correction ratios needs further study (e.g., in terms of probability); instead, their empirical effects on LST missing pixel reconstruction are examined here. Specifically, the three implementations of r in PConv correspond to the three model versions in this study: r = 1 for SAPC2-SC, r = sum(**1**)/sum(**M**) for SAPC2-OPC, and r = sum(|**W**|·**1**)/sum(|**W**|·**M**) for SAPC2. Note that similar changes to the correction ratios in the PMerge and Merge layers are implemented accordingly. All other aspects of the three models, including data, optimization, hyperparameters, and software and hardware environments, are the same. The number of trainable parameters is the same for the three models (~2.76 million), but the training time per epoch is 13% longer for SAPC2 and SAPC2-OPC (both 524 s) than for SAPC2-SC (463 s). After optimization, the three models show significantly different performance (Table 1). For example, the mean validation MSE_{masked} for SAPC2 (0.00317) is 20% and 44% smaller than that for SAPC2-OPC (0.00398) and SAPC2-SC (0.00566), respectively; the standard deviation of validation MSE_{masked} for SAPC2 (0.00645) is also 8% and 23% smaller than that for SAPC2-OPC (0.00700) and SAPC2-SC (0.00832). Comparison of the MSE_{masked} PDFs between SAPC2 and the other models (Figure 8b,c) corroborates the findings on the mean and standard deviation of MSE_{masked}. These results suggest that, for the LST missing pixel reconstruction problem, both versions of partial convolution are superior to standard convolution, and partial convolution with the absolute-weight-adjusted correction ratio may generate better results than the original partial convolution with the mask-only correction ratio.
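The three correction ratios can be compared numerically at the window level with a small sketch (the `variant` labels are hypothetical names for the three model versions discussed above):

```python
import numpy as np

def correction_ratio(W, M, variant):
    """Window-level correction ratio r for the three variants:
    'sc'  -> r = 1 (standard convolution, SAPC2-SC)
    'opc' -> r = sum(1)/sum(M) (mask-only ratio [32], SAPC2-OPC)
    'apc' -> r = sum(|W|*1)/sum(|W|*M) (absolute-weight ratio [33], SAPC2)"""
    if variant == 'sc':
        return 1.0
    if variant == 'opc':
        return M.size / np.sum(M)          # counts valid pixels only
    if variant == 'apc':
        return np.sum(np.abs(W)) / np.sum(np.abs(W) * M)  # weights valid pixels by |W|
    raise ValueError(variant)
```

Unlike the mask-only ratio, the absolute-weight ratio rescales by how much kernel weight actually falls on valid pixels, so a masked pixel under a near-zero weight barely changes r.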

#### 4.3. SAPC2 Versus SAPC1

The lower mean and standard deviation of L_{sobel,masked} for SAPC2 compared with those of SAPC1 suggest that the improvement is statistically important. Since a lower encoder/decoder layer usually contains more spatial details than the higher layers, it is speculated that changing the skip connection from the source alone (implemented in SAPC1) to the PMerge of the source and the target (adopted in SAPC2) might be the main contributor to this improvement. In contrast, the improvement of SAPC2 over SAPC1 on MSE_{masked} may be more of a combined effect of all the architecture differences between the two models as well as the additional loss term (L_{recovery} in Equation (13)) newly introduced in SAPC2. In addition, the performance improvement of SAPC2 over SAPC1 in the unmasked region (in terms of both MSE_{unmasked} and MSE_{sobel,unmasked}) is greater than in the masked region, which may indirectly improve the transition between the masked and unmasked regions in the mosaicked model prediction.

## 5. Conclusions

On the validation dataset, the MSE_{masked} of SAPC2 is 7%, 20%, 44%, and 59% lower than that of SAPC1, SAPC2-OPC, SAPC2-SC, and STS-CNN, respectively, and the L_{sobel,masked} of SAPC2 is 4%, 28%, and 29% lower than that of SAPC1, SAPC2-OPC, and SAPC2-SC, respectively. The results also show that, on selected validation cases, the repaired target images generated by SAPC2 have the fewest artifacts near the mask boundary and the best recovery of color scales and fine textures compared with the four baseline models.

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

- Shen, H.; Li, X.; Cheng, Q.; Zeng, C.; Yang, G.; Li, H.; Zhang, L. Missing Information Reconstruction of Remote Sensing Data: A Technical Review. IEEE Geosci. Remote Sens. Mag. **2015**, 3, 61–85.
- King, M.D.; Ackerman, S.; Hubanks, P.A.; Platnick, S.; Menzel, W.P. Spatial and Temporal Distribution of Clouds Observed by MODIS Onboard the Terra and Aqua Satellites. IEEE Trans. Geosci. Remote Sens. **2013**, 51, 3826–3852.
- Ottle, C.; Vidal-Madjar, D. Estimation of Land Surface Temperature with NOAA9 Data. Remote Sens. Environ. **1992**, 40, 27–41.
- Quattrochi, D.; Luvall, J.C. Thermal Infrared Remote Sensing for Analysis of Landscape Ecological Processes: Methods and Applications. Landsc. Ecol. **1999**, 14, 577–598.
- Cook, M.; Schott, J.R.; Mandel, J.; Raqueno, N. Development of an Operational Calibration Methodology for the Landsat Thermal Data Archive and Initial Testing of the Atmospheric Compensation Component of a Land Surface Temperature (LST) Product from the Archive. Remote Sens. **2014**, 6, 11244–11266.
- Zhang, C.; Li, W.; Travis, D. Gaps-Fill of SLC-off Landsat ETM+ Satellite Image Using a Geostatistical Approach. Int. J. Remote Sens. **2007**, 28, 5103–5122.
- Caselles, V. Variational Models for Image Inpainting. In Proceedings of the European Congress of Mathematics, Kraków, Poland, 2–7 July 2012; European Mathematical Society Publishing House: Berlin, Germany, 2013; pp. 227–242.
- Shen, H.; Zhang, L. A MAP-Based Algorithm for Destriping and Inpainting of Remotely Sensed Images. IEEE Trans. Geosci. Remote Sens. **2008**, 47, 1492–1502.
- Rakwatin, P.; Takeuchi, W.; Yasuoka, Y. Restoration of Aqua MODIS Band 6 Using Histogram Matching and Local Least Squares Fitting. IEEE Trans. Geosci. Remote Sens. **2008**, 47, 613–627.
- Criminisi, A.; Perez, P.; Toyama, K. Region Filling and Object Removal by Exemplar-Based Image Inpainting. IEEE Trans. Image Process. **2004**, 13, 1200–1212.
- Holben, B.N. Characteristics of Maximum-Value Composite Images from Temporal AVHRR Data. Int. J. Remote Sens. **1986**, 7, 1417–1434.
- Beck, P.S.A.; Atzberger, C.; Høgda, K.A.; Johansen, B.; Skidmore, A.K. Improved Monitoring of Vegetation Dynamics at Very High Latitudes: A New Method Using MODIS NDVI. Remote Sens. Environ. **2006**, 100, 321–334.
- Abdellatif, B.; Lecerf, R.; Mercier, G.; Hubert-Moy, L. Preprocessing of Low-Resolution Time Series Contaminated by Clouds and Shadows. IEEE Trans. Geosci. Remote Sens. **2008**, 46, 2083–2096.
- Cheng, Q.; Shen, H.; Zhang, L.; Yuan, Q.; Zeng, C. Cloud Removal for Remotely Sensed Images by Similar Pixel Replacement Guided with a Spatio-Temporal MRF Model. ISPRS J. Photogramm. Remote Sens. **2014**, 92, 54–68.
- Li, X.; Shen, H.; Zhang, L.; Li, H. Sparse-Based Reconstruction of Missing Information in Remote Sensing Images from Spectral/Temporal Complementary Information. ISPRS J. Photogramm. Remote Sens. **2015**, 106, 1–15.
- Sun, Q.; Yan, M.; Donoho, D.; Boyd, S. Convolutional Imputation of Matrix Networks. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; Volume 80, pp. 4818–4827.
- Quan, J.; Zhan, W.; Chen, Y.; Wang, M. Time Series Decomposition of Remotely Sensed Land Surface Temperature and Investigation of Trends and Seasonal Variations in Surface Urban Heat Islands. J. Geophys. Res. Atmos. **2016**, 121, 2638–2657.
- LeCun, Y.; Boser, B.; Denker, J.S.; Howard, R.E.; Habbard, W.; Jackel, L.D.; Henderson, D. Handwritten Digit Recognition with a Back-Propagation Network. In Advances in Neural Information Processing Systems; Morgan Kaufmann: Burlington, MA, USA, 1990; pp. 396–404.
- Shao, Z.; Pan, Y.; Diao, C.; Cai, J. Cloud Detection in Remote Sensing Images Based on Multiscale Features-Convolutional Neural Network. IEEE Trans. Geosci. Remote Sens. **2019**, 57, 4062–4076.
- Lei, S.; Shi, Z.; Zou, Z. Super-Resolution for Remote Sensing Images via Local–Global Combined Network. IEEE Geosci. Remote Sens. Lett. **2017**, 14, 1243–1247.
- Gu, J.; Sun, X.; Zhang, Y.; Fu, K.; Wang, L. Deep Residual Squeeze and Excitation Network for Remote Sensing Image Super-Resolution. Remote Sens. **2019**, 11, 1817.
- Zhang, D.; Shao, J.; Li, X.; Shen, H.T. Remote Sensing Image Super-Resolution via Mixed High-Order Attention Network. IEEE Trans. Geosci. Remote Sens. **2020**, 1–14.
- Xie, W.; Li, Y. Hyperspectral Imagery Denoising by Deep Learning with Trainable Nonlinearity Function. IEEE Geosci. Remote Sens. Lett. **2017**, 14, 1963–1967.
- Liu, W.; Lee, J. A 3-D Atrous Convolution Neural Network for Hyperspectral Image Denoising. IEEE Trans. Geosci. Remote Sens. **2019**, 57, 5701–5715.
- Wei, K.; Fu, Y.; Huang, H. 3-D Quasi-Recurrent Neural Network for Hyperspectral Image Denoising. IEEE Trans. Neural Netw. Learn. Syst. **2020**, 1–13.
- Li, Y.; Liu, S.; Yang, J.; Yang, M.-H. Generative Face Completion. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 5892–5900.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; Springer: Cham, Switzerland, 2015; pp. 234–241.
- Yan, Z.; Li, X.; Li, M.; Zuo, W.; Shan, S. Shift-Net: Image Inpainting via Deep Feature Rearrangement. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Ledig, C.; Theis, L.; Huszar, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 105–114.
- Mou, L.; Bruzzone, L.; Zhu, X.X. Learning Spectral-Spatial-Temporal Features via a Recurrent Convolutional Neural Network for Change Detection in Multispectral Imagery. IEEE Trans. Geosci. Remote Sens. **2018**, 57, 924–935.
- Zhang, Q.; Yuan, Q.; Zeng, C.; Li, X.; Wei, Y. Missing Data Reconstruction in Remote Sensing Image with a Unified Spatial-Temporal-Spectral Deep Convolutional Neural Network. IEEE Trans. Geosci. Remote Sens. **2018**, 56, 4274–4288.
- Liu, G.; Reda, F.A.; Shih, K.J.; Wang, T.-C.; Tao, A.; Catanzaro, B. Image Inpainting for Irregular Holes Using Partial Convolutions. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 89–105.
- Chen, M.; Newell, B.H.; Sun, Z.; Corr, C.A.; Gao, W. Reconstruct Missing Pixels of Landsat Land Surface Temperature Product Using a CNN with Partial Convolution. Proc. SPIE **2019**, 11139, 111390E.
- Smith, L.N. Cyclical Learning Rates for Training Neural Networks. In Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, 24–26 March 2017; pp. 464–472.
- Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. arXiv **2016**, arXiv:1603.04467.

**Figure 1.** The overall architecture of the proposed Source Augmented Partial Convolution v2 (SAPC2) model. The colors and shapes denote the origin and/or the purpose of the symbols. The grey boxes/circles indicate operators (including custom layers). The arrows show the data flow of SAPC2; the horizontal lines and non-grey rectangles indicate data. The blue rectangles highlight data on the target path. Similarly, the green, purple, and black rectangles are data associated with the source path, the skip connection, and the decoder path, respectively, and the red line highlights the model output (i.e., the raw repaired target image). The dotted lines are masks and imply the associated image has missing pixels. The spatial size and number of features are either labeled near the data symbols or placed on the left side of the figure.

**Figure 2.** Structure of a SAPC2 Encoder. "T" and "S" represent target and source images, respectively. The dotted line denotes the mask of the target image. The two Partial Convolution layers share the same weights. For the target image, if there are remaining missing pixels after Partial Convolution, before batch normalization (BN), the missing part is temporarily filled with random values sampled from a normal distribution with the mean and standard deviation derived from the unmasked part.

**Figure 4.** The flowchart of a SAPC2 Partial Convolution layer (PConv) (**a**) and the PConv Correction Ratio submodule (**b**). Hollow black squares represent data nodes and solid black squares represent trainable weights used in convolutions. Grey symbols denote operators, where circles are pointwise operators, squares are standard 2D convolutions, and rounded rectangles are custom functions/layers. For example, the grey circle with "1" inside denotes an operator that sets all values of the input image to 1, and the grey circle with "|·|" inside denotes an element-wise absolute operator. Symbols B, H, W, and F_{in} represent the input image's batch size, height, width, and number of features, respectively. Symbols k_{H}, k_{W}, S, and F_{out} are the convolutions' kernel height, kernel width, stride length, and number of output features. The two convolutions in (**b**) that derive the numerator and the denominator share the same kernel, |W_{PConv}|. The implementation of the mask updating step (i.e., Conv2D 2 and Clip to [0,1] in (**b**)) for Equation (2) originated from the following source: https://github.com/MathiasGruber/PConv-Keras.

**Figure 5.** The flowchart of a SAPC2 Partial Merge layer (PMerge) (**a**); the Merge Weight submodule (**b**); and the PMerge Correction Ratio submodule (**c**).

**Figure 6.** The flowchart of a SAPC2 Merge layer (Merge) (**a**); and the Merge Correction Ratio submodule (**b**).

**Figure 8.** Comparison of Probability Density Functions (PDFs) of MSE_{masked} between SAPC2 and STS-CNN (**a**), SAPC2-SC (**b**), SAPC2-OPC (**c**), and SAPC1 (**d**) on the entire validation dataset. Note that the bin size of the PDFs is small (i.e., 0.0002). As a result, the PDF values at some bins may be high, but the sum of the bin-size-weighted PDF values over the full range of MSE_{masked} still equals 1.

**Figure 9.** Model prediction comparison, Case 1: (**a**) the input source image (acquired on day of year (DOY) 228); (**b**) the masked target image (acquired on DOY 276), with the masked part shown in black; (**c**) the ground truth (complete target) image (also acquired on DOY 276); (**d**–**h**) mosaicked model predictions for the four baseline models (i.e., left to right: STS-CNN, SAPC2-SC, SAPC2-OPC, and SAPC1) and for the proposed model (i.e., SAPC2). All model predictions share the same color bar as that of the ground truth (**c**). Note that the color bars are independent between the source and the target (or the ground truth). The masked fraction of the target image (**b**) is 0.476. The correlation coefficient between the source (**a**) and ground truth (**c**) images is 0.957.

**Figure 10.** The same as Figure 9 but for Case 2. The acquisition DOYs for the source and target inputs are 251 and 267, respectively. The masked fraction of the target image (**b**) is 0.531. The correlation coefficient between the source (**a**) and ground truth (**c**) images is 0.975.

**Figure 11.** The same as Figure 9 but for Case 3 (acquisition DOYs for the source and target inputs are 258 and 210, respectively). The masked fraction of the target image (**b**) is 0.448. The correlation coefficient between the source (**a**) and ground truth (**c**) images is 0.955.

**Figure 12.** The same as Figure 9 but for Case 4 (acquisition DOYs for the source and target inputs are 258 and 242, respectively). The masked fraction of the target image (**b**) is 0.549. The correlation coefficient between the source (**a**) and ground truth (**c**) images is 0.968.

**Figure 13.** The same as Figure 9 but for Case 5 (acquisition DOYs for the source and target inputs are 212 and 260, respectively). The masked fraction of the target image (**b**) is 0.476. The correlation coefficient between the source (**a**) and ground truth (**c**) images is 0.915.

**Figure 14.** The same as Figure 9 but for Case 6 (acquisition DOYs for the source and target inputs are 208 and 160, respectively). The masked fraction of the target image (**b**) is 0.407. The correlation coefficient between the source (**a**) and ground truth (**c**) images is 0.839.

**Table 1.** Means and standard deviations of nine validation metrics for the proposed model (SAPC2) and the four baseline models (SAPC1, SAPC2-OPC, SAPC2-SC, and STS-CNN). In each box, the first value is the mean of the row's metric for the column's model, and the value in parentheses is the corresponding standard deviation. The arrows beside each metric name indicate whether a greater or lesser value is better: the left arrow refers to the mean, and the arrow in parentheses refers to the standard deviation. The best values for each metric (i.e., for each row) are bolded. Since STS-CNN returns close-to-zero values in the unmasked part, four metrics (i.e., L_{sobel,masked}, MSE_{unmasked}, MSE_{sobel,unmasked}, and MSE_{weighted}) are not available (N/A) for STS-CNN.

Metric | SAPC2 | SAPC1 | SAPC2-OPC | SAPC2-SC | STS-CNN |
---|---|---|---|---|---|
MSE_{masked} ↓(↓) | **0.00317** (**0.00645**) | 0.00341 (0.00699) | 0.00398 (0.00700) | 0.00566 (0.00832) | 0.00769 (0.01198) |
L_{sobel,masked} ↓(↓) | **0.01156** (**0.01430**) | 0.01203 (0.01531) | 0.01608 (0.01641) | 0.01600 (0.01734) | N/A (N/A) |
MSE_{unmasked} ↓(↓) | **0.00025** (**0.00023**) | 0.00033 (0.00030) | 0.00072 (0.00050) | 0.00063 (0.00050) | N/A (N/A) |
MSE_{sobel,unmasked} ↓(↓) | **0.00221** (**0.00188**) | 0.00273 (0.00221) | 0.00803 (0.00565) | 0.00482 (0.00342) | N/A (N/A) |
MSE_{weighted} ↓(↓) | **0.01793** (**0.02576**) | 0.01913 (0.02786) | 0.02635 (0.02950) | 0.02852 (0.03249) | N/A (N/A) |
MSE_{mosaic} ↓(↓) | **0.00098** (**0.00232**) | 0.00105 (0.00250) | 0.00121 (0.00251) | 0.00168 (0.00290) | 0.00228 (0.00444) |
CC_{mosaic} ↑(↓) | **0.986** (**0.019**) | 0.985 (**0.019**) | 0.981 (0.022) | 0.973 (0.034) | 0.955 (0.074) |
PSNR_{mosaic} ↑(↓) | **47.55** (5.34) | 47.33 (5.42) | 46.12 (4.92) | 44.21 (4.59) | 42.85 (**4.33**) |
SSIM_{mosaic} ↑(↓) | **0.989** (**0.020**) | 0.989 (0.021) | 0.986 (0.022) | 0.983 (0.025) | 0.971 (0.031) |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Chen, M.; Sun, Z.; Newell, B.H.; Corr, C.A.; Gao, W. Missing Pixel Reconstruction on Landsat 8 Analysis Ready Data Land Surface Temperature Image Patches Using Source-Augmented Partial Convolution. *Remote Sens.* **2020**, *12*, 3143.
https://doi.org/10.3390/rs12193143
