# Convolutional Neural Network for Pansharpening with Spatial Structure Enhancement Operator


## Abstract


## 1. Introduction

- Loss of spectral information. In CS-based methods, for example, IHS methods [6] convert the original MS image from RGB space to IHS space and then replace the intensity (I) component with a high-resolution image (or an image to be fused) obtained by different projection methods. In IHS space, the correlation among the three components is relatively low, so each component can be processed separately. The I component mainly reflects the total radiation energy of ground features and its spatial distribution, i.e., the spatial characteristics, whereas H and S carry the spectral information. PCA-based methods [7], in contrast to the IHS transform, decompose the image into multiple components; replacing the first principal component, which contains the contour information, effectively improves the spatial resolution of the multispectral image. However, the low correlation between the original and replacement components also causes spectral distortion.
- Artifacts in the spatial structure. Among the MRA-based methods, pansharpening based on the multiresolution DWT [9] has become increasingly popular owing to its superior spectral-retention ability. Nevertheless, because of the subsampling in the wavelet decomposition, artifacts usually appear in the fusion results.
- Reliance on prior knowledge. Among methods that rely on a prior model, the core issue is that the solution of the model depends heavily on prior knowledge. Moreover, the construction of the energy functional treats the degradation process as a linear transformation, whereas in real scenes the degradation from an HR image to an LR image is nonlinear, with various superposed noises and disturbances.
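As a concrete illustration of the component-substitution idea above, and of where the spectral distortion comes from, here is a minimal sketch of fast-IHS-style detail injection. It assumes numpy; the function name and the band-mean approximation of the intensity component are illustrative simplifications, not the exact transform used by the methods cited:

```python
import numpy as np

def ihs_pansharpen(ms, pan):
    """Illustrative fast-IHS component substitution.

    ms:  (H, W, 3) MS image, already upsampled to the PAN grid.
    pan: (H, W) high-resolution panchromatic image.
    The intensity I is approximated by the band mean; the PAN - I
    detail is injected into every band, which is exactly where
    spectral distortion arises when PAN and I are poorly correlated.
    """
    intensity = ms.mean(axis=2)        # I component (simple band average)
    detail = pan - intensity           # high-frequency spatial detail
    return ms + detail[..., None]      # inject the same detail into each band
```

When `pan` happens to equal the intensity component, no detail is injected and the MS image is returned unchanged, which makes the source of distortion explicit: everything hinges on how well PAN matches I.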

- We propose an original pansharpening method that combines an edge-detection operator with a CNN to strengthen edge-feature extraction. The proposed method achieves preferable performance in both simulated and real-data experiments.
- We present a three-layer CNN that acquires spatial feature maps while preserving spatial- and spectral-domain information.
- We design a spatial structure enhancement operator that uses the Sobel operator as the edge detector to extract high-frequency information from the original PAN and LRMS images, thereby obtaining the spatial features of the images.
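The Sobel-based high-frequency extraction mentioned in the last contribution can be sketched in numpy as follows; the naive convolution loop is written for clarity rather than speed:

```python
import numpy as np

# Standard Sobel kernels for horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def conv2d(img, kernel):
    """'Same'-size 2-D convolution with zero padding (naive version)."""
    k = kernel.shape[0] // 2
    padded = np.pad(img, k)
    flipped = kernel[::-1, ::-1]       # flip for true convolution
    out = np.zeros(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * flipped)
    return out

def sobel_features(img):
    """Gradient magnitude: the high-frequency map the operator extracts."""
    gx = conv2d(img, SOBEL_X)
    gy = conv2d(img, SOBEL_Y)
    return np.hypot(gx, gy)
```

On a flat region the response is zero, and on a step edge the magnitude peaks at the edge, which is why these maps serve as spatial-structure features for the network.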

## 2. Background

#### 2.1. CNN Model

#### 2.2. Pansharpening Based on CNN

#### 2.3. Edge Detection Operator

## 3. The Proposed Methods

#### 3.1. Pansharpening Methods

**B** denotes the number of bands of the LRMS image.

#### 3.2. Network of Feature Extraction

#### 3.3. Spatial Structure Enhancement

#### 3.4. Optimization Method

**Algorithm 1.** Spatial structure enhancement convolution operator pansharpening algorithm.

Input: LRMS image $X_{MS}$; PAN image $X_{PAN}$
Output: HRMS image $G_{MS}$
1: for each $patch_{i,j} \in X_{MS}, X_{PAN}$ do
2:&nbsp;&nbsp;&nbsp;&nbsp;Extract feature maps $S'(X_{MS})$ of $patch_i$ with the Sobel operator
3:&nbsp;&nbsp;&nbsp;&nbsp;Extract feature maps $S(X_{PAN})$ of $patch_j$ with the Sobel operator
4:&nbsp;&nbsp;&nbsp;&nbsp;Up-sample $S'(X_{MS})$ by a factor of four with bicubic interpolation to generate the feature maps $S(X_{MS})$
5:&nbsp;&nbsp;&nbsp;&nbsp;Combine the feature maps generated in Steps 3 and 4 to obtain the final feature maps $S(X_{MS}, X_{PAN})$
6:&nbsp;&nbsp;&nbsp;&nbsp;Filter the feature maps, $\psi(S(X_{MS}, X_{PAN}))$, with the three-layer convolutional neural network
7:&nbsp;&nbsp;&nbsp;&nbsp;Fuse the pre-interpolated LRMS image with the final feature maps to generate the fused high-resolution image $G_{MS}$
8:&nbsp;&nbsp;&nbsp;&nbsp;Evaluate the $LOSS$ between the ground truth and the fused image $G_{MS}$
9:&nbsp;&nbsp;&nbsp;&nbsp;if $LOSS < \tau$
10:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Return to Step 2
11:&nbsp;&nbsp;&nbsp;&nbsp;end if
12: end for
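Steps 2–7 of Algorithm 1 can be sketched as a single forward pass. In this sketch, `np.gradient` stands in for the Sobel operator, nearest-neighbour upsampling stands in for the paper's bicubic interpolation, and a band-wise mean stands in for the learned three-layer CNN $\psi$; all three are placeholders, and the function name is ours:

```python
import numpy as np

def grad_mag(img):
    """Gradient magnitude via np.gradient -- a stand-in for the Sobel step."""
    gy, gx = np.gradient(img)
    return np.hypot(gx, gy)

def enhance_and_fuse(ms, pan, scale=4):
    """Sketch of Algorithm 1, Steps 2-7 (forward pass only).

    ms:  (h, w, B) LRMS image; pan: (scale*h, scale*w) PAN image.
    """
    # Steps 2 and 4: gradient features of each MS band, upsampled x4.
    ms_feats = np.stack(
        [np.repeat(np.repeat(grad_mag(ms[..., b]), scale, 0), scale, 1)
         for b in range(ms.shape[-1])], axis=-1)
    # Step 3: gradient features of the PAN image.
    pan_feat = grad_mag(pan)[..., None]
    # Step 5: combine the two sets of feature maps.
    feats = np.concatenate([ms_feats, pan_feat], axis=-1)
    # Step 6 placeholder: collapse features (the paper uses a 3-layer CNN).
    detail = feats.mean(axis=-1, keepdims=True)
    # Step 7: fuse the pre-interpolated LRMS with the detail map.
    ms_up = np.repeat(np.repeat(ms, scale, 0), scale, 1)
    return ms_up + detail
```

The point of the sketch is the data flow: spatial features come from both inputs, are merged, filtered, and then injected into the upsampled LRMS image.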

## 4. Experiments and Discussion

#### 4.1. Experimental Settings

#### 4.1.1. Datasets

#### 4.1.2. Parameters and Training

We initialized the learning rate to $10^{-3}$. Then we multiplied it by $10^{-1}$ at $10^{5}$ and $2 \times 10^{5}$ iterations, and finally, we terminated the process at $3.0 \times 10^{5}$ iterations. In the spatial structure enhancement module, we set the convolution sizes to ${s}_{1}={s}_{2}={s}_{3}=3$ and the convolution numbers to ${a}_{1}={a}_{2}=32$.
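The step decay described above can be written as a small helper; this is a sketch, and the function name and milestone encoding are ours, not from the paper:

```python
def learning_rate(iteration, base_lr=1e-3, gamma=0.1,
                  milestones=(100_000, 200_000), stop=300_000):
    """Step-decay schedule from Section 4.1.2: start at 1e-3,
    multiply by 1e-1 at 1e5 and 2e5 iterations, stop at 3e5."""
    if iteration >= stop:
        raise ValueError("training terminates at %d iterations" % stop)
    # Count how many milestones have been passed; each one scales by gamma.
    passed = sum(iteration >= m for m in milestones)
    return base_lr * gamma ** passed
```

The same schedule could equally be configured through a framework's multi-step scheduler; the helper just makes the three plateaus of the schedule explicit.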

#### 4.2. Assessments

#### 4.2.1. Quantitative Assessments

For the real-data experiments, where no ground truth is available, we adopted the no-reference quality index QNR together with its spatial and spectral distortion components ($D_S$ and $D_\lambda$) [35].
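For reference, QNR combines the two distortion indices as $QNR = (1-D_\lambda)^{\alpha}(1-D_S)^{\beta}$, with $\alpha=\beta=1$ in the usual setting [35]; computing the distortions themselves requires the full image stacks, so only the combination step is sketched here:

```python
def qnr(d_lambda, d_s, alpha=1.0, beta=1.0):
    """Quality with No Reference: (1 - D_lambda)^a * (1 - D_s)^b.
    Higher is better; equals 1 only when both distortions are zero."""
    return (1.0 - d_lambda) ** alpha * (1.0 - d_s) ** beta
```

Because both factors are at most 1, QNR penalizes spectral and spatial distortion jointly: improving one index cannot mask a regression in the other.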

#### 4.2.2. Visual Assessments

#### 4.3. Experiment 1: Simulated Experiments

#### 4.4. Experiment 2: Real-Data Experiments

Although the $D_S$ and $D_\lambda$ values were better for the DiCNN method, the proposed method retained similar values. For the QNR index, the proposed method achieved the best performance among the CNN-based methods. These no-reference indicators measure the restoration quality of direct fusion in the absence of ground truth, which indicates that our method can achieve good results in the practical situation of directly fusing full-resolution images.

## 5. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

- He, C.; Sun, L.; Huang, W.; Zhang, J.; Zheng, Y.; Jeon, B. TSLRLN: Tensor subspace low-rank learning with non-local prior for hyperspectral image mixed denoising. Signal Process. **2021**, 184, 108060.
- Sun, L.; He, C.; Zheng, Y.; Tang, S. SLRL4D: Joint restoration of subspace low-rank learning and non-local 4-D transform filtering for hyperspectral image. Remote Sens. **2020**, 12, 2979.
- He, W.; Yao, Q.; Li, C.; Yokoya, N.; Zhao, Q.; Zhang, H.; Zhang, L. Non-local meets global: An integrated paradigm for hyperspectral image restoration. IEEE Trans. Pattern Anal. Mach. Intell. **2020**, 1.
- Loncan, L.; Almeida, L.; Bioucas-Dias, J.; Briottet, X.; Chanussot, J.; Dobigeon, N.; Fabre, S.; Liao, W.; Licciardi, G.A.; Simoes, M.; et al. Hyperspectral pansharpening: A review. IEEE Geosci. Remote Sens. Mag. **2015**, 3, 27–46.
- Meng, X.; Shen, H.; Li, H.; Zhang, L.; Fu, R. Review of the pansharpening methods for remote sensing images based on the idea of meta-analysis: Practical discussion and challenges. Inf. Fusion **2019**, 46, 102–113.
- Aiazzi, B.; Baronti, S.; Selva, M. Improving component substitution pansharpening through multivariate regression of MS+Pan data. IEEE Trans. Geosci. Remote Sens. **2007**, 45, 3230–3239.
- Zhang, H.; Liu, L.; He, W.; Zhang, L. Hyperspectral image denoising with total variation regularization and nonlocal low-rank tensor decomposition. IEEE Trans. Geosci. Remote Sens. **2019**, 99, 1–14.
- Gillespie, A.R.; Kahle, A.B.; Walker, R.E. Color enhancement of highly correlated images. II. Channel ratio and "chromaticity" transformation techniques. Remote Sens. Environ. **1987**, 22, 343–365.
- Khan, M.M.; Chanussot, J.; Condat, L.; Montanvert, A. Indusion: Fusion of multispectral and panchromatic images using the induction scaling technique. IEEE Geosci. Remote Sens. Lett. **2008**, 5, 98–102.
- Aiazzi, B.; Alparone, L.; Baronti, S.; Garzelli, A.; Selva, M. MTF-tailored multiscale fusion of high-resolution MS and pan imagery. Photogramm. Eng. Remote Sens. **2006**, 72, 591–596.
- Liu, J. Smoothing filter-based intensity modulation: A spectral preserve image fusion technique for improving spatial details. Int. J. Remote Sens. **2000**, 21, 3461–3472.
- Shi, G.; Luo, F.; Tang, Y.; Li, Y. Dimensionality reduction of hyperspectral image based on local constrained manifold structure collaborative preserving embedding. Remote Sens. **2021**, 13, 1363.
- Duan, Y.; Huang, H.; Wang, T. Semisupervised feature extraction of hyperspectral image using nonlinear geodesic sparse hypergraphs. IEEE Trans. Geosci. Remote Sens. **2021**, 1–15.
- Zhang, L.; Shen, H.; Gong, W.; Zhang, H. Adjustable model-based fusion method for multispectral and panchromatic images. IEEE Trans. Syst. Man Cybern. Part B Cybern. **2012**, 42, 1693–1704.
- Dong, C.; Loy, C.C.; He, K.; Tang, X. Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. **2016**, 38, 295–307.
- Masi, G.; Cozzolino, D.; Verdoliva, L.; Scarpa, G. Pansharpening by convolutional neural networks. Remote Sens. **2016**, 8, 594–615.
- He, L.; Rao, Y.; Li, J.; Plaza, A.; Zhu, J. Pansharpening via detail injection based convolutional neural networks. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. **2019**, 12, 1188–1204.
- LeCun, Y.; Bottou, L. Gradient-based learning applied to document recognition. Proc. IEEE **1998**, 86, 2278–2324.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv **2014**, arXiv:1409.1556.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016.
- Wei, Y.; Yuan, Q.; Shen, H.; Zhang, L. Boosting the accuracy of multispectral image pansharpening by learning a deep residual network. IEEE Geosci. Remote Sens. Lett. **2017**, 14, 1795–1799.
- Rosenfeld, A. The Max Roberts operator is a Hueckel-type edge detector. IEEE Trans. Pattern Anal. Mach. Intell. **1981**, 3, 101–103.
- Kanopoulos, N.; Vasanthavada, N.; Baker, R.L. Design of an image edge detection filter using the Sobel operator. IEEE J. Solid-State Circuits **1988**, 23, 358–367.
- Dong, W.; Shisheng, Z. Color image recognition method based on the Prewitt operator. In Proceedings of the 2008 International Conference on Computer Science and Software Engineering, Wuhan, China, 12–14 December 2008; pp. 170–173.
- Wang, X. Laplacian operator-based edge detectors. IEEE Trans. Pattern Anal. Mach. Intell. **2007**, 29, 886–890.
- Canny, J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. **1986**, 8, 679–714.
- Ballester, C.; Caselles, V.; Igual, L.; Verdera, J. A variational model for P+XS image fusion. Int. J. Comput. Vis. **2006**, 69, 43–58.
- Hadsell, R.; Chopra, S.; LeCun, Y. Dimensionality reduction by learning an invariant mapping. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), New York, NY, USA, 17–22 June 2006.
- Robbins, H.; Monro, S. A stochastic approximation method. Ann. Math. Stat. **1951**, 22, 400–407.
- Yuhas, R.H.; Goetz, A.F.; Boardman, J.W. Discrimination among semi-arid landscape endmembers using the spectral angle mapper (SAM) algorithm. In Proceedings of the 3rd Annual JPL Airborne Geoscience Workshop, Pasadena, CA, USA, 1–5 June 1992; pp. 147–149.
- Wald, L. Data Fusion: Definitions and Architectures: Fusion of Images of Different Spatial Resolutions; Les Presses de l'École des Mines: Paris, France, 2002.
- Zhou, J.; Civco, D.L.; Silander, J.A. A wavelet transform method to merge Landsat TM and SPOT panchromatic data. Int. J. Remote Sens. **1998**, 19, 743–757.
- Wang, Z.; Bovik, A.C. A universal image quality index. IEEE Signal Process. Lett. **2002**, 9, 81–84.
- Alparone, L.; Aiazzi, B.; Baronti, S.; Garzelli, A.; Nencini, F.; Selva, M. Multispectral and panchromatic data fusion assessment without reference. Photogramm. Eng. Remote Sens. **2008**, 74, 193–200.
- Yuan, Q.; Wei, Y.; Meng, X.; Shen, H.; Zhang, L. A multiscale and multidepth convolutional neural network for remote sensing imagery pan-sharpening. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. **2018**, 11, 978–989.
- Aiazzi, B.; Alparone, L.; Baronti, S.; Garzelli, A. Context-driven fusion of high spatial and spectral resolution images based on oversampled multiresolution analysis. IEEE Trans. Geosci. Remote Sens. **2002**, 40, 2300–2312.

**Figure 6.** Simulated results for the Worldview-2 dataset. (**a**) Ground truth; (**b**) PCA; (**c**) SFIM; (**d**) MTF-GLP; (**e**) PNN; (**f**) DiCNN; (**g**) proposed.

**Figure 7.** Simulated results for the Quickbird dataset. (**a**) Ground truth; (**b**) PCA; (**c**) SFIM; (**d**) MTF-GLP; (**e**) PNN; (**f**) DiCNN; (**g**) proposed.

**Figure 8.** Real-data results for the Quickbird dataset. (**a**) Original PAN; (**b**) PCA; (**c**) SFIM; (**d**) MTF-GLP; (**e**) PNN; (**f**) DiCNN; (**g**) proposed.

**Figure 9.** Real-data results for the Worldview-2 dataset. (**a**) Original PAN; (**b**) PCA; (**c**) SFIM; (**d**) MTF-GLP; (**e**) PNN; (**f**) DiCNN; (**g**) proposed.

| Method | ERGAS (↓) | SAM (↓) | SCC (↑) | PSNR (↑) | SSIM (↑) | Time (s) |
|---|---|---|---|---|---|---|
| EXP | 5.9835 | 2.8785 | 0.5988 | 20.7926 | 0.7092 | – |
| PCA | 12.484 | 2.5433 | 0.6486 | 23.9662 | 0.7284 | 0.54 |
| SFIM | 5.8498 | 2.5448 | 0.8481 | 24.1771 | 0.7132 | 0.79 |
| MTF-GLP | 3.8994 | 2.5426 | 0.9139 | 27.3228 | 0.8021 | 0.47 |
| PNN | 3.9079 | 1.5470 | 0.9071 | 28.9181 | 0.8048 | 1.99 |
| DiCNN | 3.6055 | 1.5457 | 0.9365 | 29.7285 | 0.8638 | 2.03 |
| Proposed | 3.6246 | 1.5455 | 0.9377 | 30.1652 | 0.8679 | 2.45 |

| Method | QNR (↑) | D_{S} (↓) | D_{λ} (↓) |
|---|---|---|---|
| PCA | 0.6257 | 0.2679 | 0.1454 |
| SFIM | 0.8700 | 0.0523 | 0.0820 |
| MTF-GLP | 0.8166 | 0.0894 | 0.1033 |
| PNN | 0.8237 | 0.1287 | 0.2406 |
| DiCNN | 0.8606 | 0.1115 | 0.0313 |
| Proposed | 0.8675 | 0.1147 | 0.0452 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Huang, W.; Zhang, Y.; Zhang, J.; Zheng, Y.
Convolutional Neural Network for Pansharpening with Spatial Structure Enhancement Operator. *Remote Sens.* **2021**, *13*, 4062.
https://doi.org/10.3390/rs13204062
