MS-Pansharpening Algorithm Based on Dual Constraint Guided Filtering

Wang, Xianghai; Mu, Zhenhua; Bai, Shifu; Feng, Yining; Song, Ruoxi

doi:10.3390/rs14194867

Open AccessArticle

MS-Pansharpening Algorithm Based on Dual Constraint Guided Filtering

by

Xianghai Wang

^1,2

,

Zhenhua Mu

¹,

Shifu Bai

²,

Yining Feng

¹ and

Ruoxi Song

^3,*

¹

School of Geography, Liaoning Normal University, Dalian 116029, China

²

School of Computer and Information Technology, Liaoning Normal University, Dalian 116029, China

³

Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China

^*

Author to whom correspondence should be addressed.

Remote Sens. 2022, 14(19), 4867; https://doi.org/10.3390/rs14194867

Submission received: 2 August 2022 / Revised: 18 September 2022 / Accepted: 27 September 2022 / Published: 29 September 2022

(This article belongs to the Section Remote Sensing Image Processing)

Download

Browse Figures

Versions Notes

Abstract

:

The difference and complementarity of spatial and spectral information between multispectral (MS) image and panchromatic (PAN) image have laid the foundation for the fusion of the two types of images. In recent years, MS and PAN image fusion (also known as MS-Pansharpening) has gained attention as an important research area in remote sensing (RS) image processing. This paper proposes an MS-Pansharpening algorithm based on dual constraint Guided Filtering in the nonsubsampled shearlet transform (NSST) domain. The innovation is threefold. First, the dual constraint guided image filtering (DCGIF) model, based on spatial region average gradient correlation and vector correlation formed by neighborhood elements is proposed. Further, the PAN image detail information extraction scheme, based on the model, is provided, which extracts more complete and accurate detail information, thus avoiding, to some extent, the spectral distortion caused by the injection of non-adaptive information. Second, the weighted information injection model, based on the preservation of the correlation between the band spectra, is proposed. The model determines the information injection weight of each band pixel based on the spectral proportion between bands of the original MS image, which ensures the spectral correlation between bands of the fused MS image. Finally, a new MS-Pansharpening algorithm in NSST domain is proposed. The MS and PAN high frequency sub-bands of NSST are used to extract more effective spatial details. Then the proposed DCGIF model is used to extract the effective spatial detail injection information through the weighted joint method based on the regional energy matrix. Finally, the weighted information injection model is used to inject it into each band of MS to complete information fusion. Experimental results show that the proposed approach has better fusion effect than some conventional MS-Pansharpening algorithms, which can effectively improve the spatial resolution of the fused MS image and maintain the spectral characteristics of MS.

Keywords:

multispectral image; panchromatic image; nonsubsampled shearlet transform; ms-pansharpening; guided filtering; feature injection; spectrum preservation

1. Introduction

The basic task of RS is to measure and record electromagnetic wave radiation signals reflected or emitted from the earth’s surface by sensors carried on airplanes or satellites, and to form different RS images according to different times, wavelengths or frequencies [1,2]. RS images with different spectral resolutions can be formed by the sensor according to the number of bands, such as PAN, MS and Hyperspectral (HS) images. Sensors can form RS images with different spatial resolutions according to the size of the distance between ground objects in the captured image [3]. In recent years, with the increasing demand and refinement of RS for earth observation, such as classification of ground features and change detection, RS image resources with both high spatial and spectral resolution are urgently needed. Currently, due to the limitation of sensor hardware technology, RS images with both high spatial and high spectral resolution cannot be obtained from a single device [4]. Fortunately, many commercial RS satellites can simultaneously carry many types of sensors to observe the same surface features and form different types of RS images. Satellites, such as IKONOS, Landsat 8, QuickBird, and World View-2, are equipped with PAN and MS sensors on the same platform to obtain both PAN and MS images of the same surface, which lays the foundation for using fusion methods to obtain RS images with both high spectral and spatial resolution.

In fact, the PAN is mixed image with visible light band ranges, which have a high spatial resolution but cannot display the colors of surfaces because of having a single band. MS covers several to more than a dozen bands, which can be used to distinguish surfaces by their spectral characteristics, and, at the same time, the type of surface can be determined by the structure and morphology of the surface reflected. However, because MS sensor wavelength range is shorter than that of PAN, MS has lower spatial resolution, which affects the accuracy of MS in earth observation applications, such as dynamic detection of ground objects, target recognition, etc. How to achieve information fusion between MS and PAN through information complementation (called MS-Pansharpening [5]) to obtain MS images with high spatial resolution has become an important research field in RS image processing [6,7].

At present, the MS-Pansharpening method mainly contains two categories: component substitution (CS) and multiresolution analysis (MRA) [5]. The CS method uses high spatial resolution components of PAN to replace low spatial resolution components of MS to obtain a fused MS with high spatial resolution. Early methods include the intensity-hue-saturation (IHS) method [8], principal component analysis (PCA) method [9], and Gram-Schmidt (GS) method [10]. Later, in order to better preserve the details of fused images, the generalized IHS (GIHS) algorithm [11] and the adaptive IHS (AIHS) algorithm [12] were proposed. This method assumes that there is a projection transformation that projects an MS image into another vector space so that its spatial structure and spectral information are well separated. Its advantage is that the spatial structure information of the fused image is well maintained, while the disadvantage is that local differences between MS and PAN often result in significant spectral distortion of the fused image [5]. MRA use multiscale transformation to extract spatial details of the PAN, and inject it into the resampled image of the MS to obtain a fused MS image with high spatial resolution. Early methods focused on obtaining spatial information of PAN using different types of wavelet and laplacian pyramid transforms [13], such as the wavelet method [14,15], nonsubsampled wavelet method [16], and generalized laplacian pyramid (GLP) method [17], etc. With the continuous development of MRA, there are some new methods, such as contourlet method [18], shearlet method [19], nonsubsampled contourlet transform (NSCT) method [20], and NSST method [21], etc. Unlike the CS method, which usually uses the difference between the PAN image and the MS image to obtain spatial information, the MRA method uses its own filters to extract spatial structure information from the PAN image independently and inject it into the MS image as equally as possible, which generally enables the fused MS to retain spectral information better, but the spatial detail structure is usually distorted.

In addition to the two MS-Pansharpening methods mentioned above, there are also some methods based on variational optimization (VO) and deep learning (DL) [6]. The former is based on the variational theory to construct an energy function by observing models or sparse representation theory, and further iteratively solving through optimization algorithms to obtain MS images with improved spatial resolution [22,23]. The latter uses deep learning theory, as in the literature [24], to assume that there is a complex nonlinear relationship between the observed image and the ideal fused MS image, and then uses some deep neural networks with significant non-linear characterization ability to automatically learn all parameters under the supervision of large training samples [25,26]. These two methods can improve the quality of fused images from different aspects, but the former generally has a high computational complexity, while the latter usually requires a large number of sample training networks, which limits the development of these two methods to a certain extent, refer to [6] for detailed analysis. In addition, the hybrid model approaches have also become hot in recent years, with deep learning combining the benefits of CS-based and MRA-based approaches in an effective and efficient way [27]. Overall, although Pansharpening fusion performance has been improved from different perspectives, how to maximize the spatial resolution of the fused image while effectively maintaining the spectral information of the original MS is still a challenging task and there are many issues to be studied.

In addition to wavelet transform, a number of multiscale transforms have successively emerged in the multiscale geometric analysis family that can effectively characterize the high-dimensional singular characteristics [28]. These transforms greatly improve the ability to capture the high-dimensional singular features, such as midline and polygon, by increasing the sensitivity to direction, which provides effective support for sparse representation of images and signals. Among them, Nonsubsampled multiscale transform [29,30,31,32] is highly valued in the fields of image enhancement, image watermarking and image fusion because of its translation invariance and information redundancy [33,34,35].

This paper presents the MS-Pansharpening method based on NSST and dual constraint guided filtering. The main contributions are listed as follows:

(1): The Dual Constraint Guided Image Filtering (DCGIF) model, based on spatial region average gradient correlation and vector correlation formed by neighborhood elements. is proposed. On this basis, the DCGIF detail extraction scheme is proposed. On the one hand, the spatial detail information map extracted from the PAN is more adaptive under the guidance of MS. On the other hand, it has better integrity and accuracy under the constraints of element neighborhood correlation and average gradient correlation.
(2): The MS spatial detail information injection model, based on band spectral correlation, is presented, which determines the information injection weight matrix of each band based on the spectral ratio between the original MS bands, ensuring the spectral correlation between the fused MS image bands.
(3): A new scheme of MS-Pansharpening in NSST domain is proposed. First, a more effective source of spatial detail information is obtained from the NSST high frequency sub-bands of MS and PAN. Then, effective spatial detail injection information is extracted by weighted fusion, based on the region energy matrix, using the proposed DCGIF model. Finally, the spatial detail information is injected into each MS band according to the information injection model to complete the information fusion. A large number of simulation experiments show that the proposed algorithm can effectively improve the spatial resolution of the fused MS image, while maintaining its spectral characteristics, and achieve better fusion results than some conventional fusion algorithms.

2. Related Works

2.1. Guided Image Filter

Guided image filter (GIF) is an adaptive weighted linear filter proposed by He et al. [36]. Unlike general filters, the GIF is performed under the guidance of another so-called “guidance image”, which can be either the image to be filtered itself or another image. Its implementation is as follows.

Set the guidance image, the filter input image and the filter output image to be

X

,

p

, and

q

, respectively. Assume that there is a local linear relationship between

X

and

q

in the local window

ω_{k}

centered on pixel

k

, that is:

q_{i} = a_{k} X_{i} + b_{k}, \forall i \in ω_{k}

(1)

where

q_{i} \in q

,

X_{i} \in X

.

(a_{k}, b_{k})

is a linear coefficient in

ω_{k}

and can be obtained by minimizing the following cost function:

E (a_{k}, b_{k}) = \sum_{i \in ω_{k}} ({(a_{k} X_{i} + b_{k} - p_{i})}^{2} + η a_{k}^{2})

(2)

here, parameter

η

is a regularization parameter preventing

a_{k}

from being too large. The values

a_{k}

and

b_{k}

are obtained by the linear regression method:

{\begin{matrix} a_{k} = \frac{\frac{1}{| ω_{k} |} \sum_{i \in ω_{k}} X_{i} p_{i} - μ_{k} {\bar{p}}_{k}}{δ_{k}^{2} + η} \\ b_{k} = {\bar{p}}_{k} - a_{k} μ_{k} \end{matrix}

(3)

δ_{k}^{2}

and

μ_{k}

represent the variance and mean of

X

within

ω_{k}

,

| ω_{k} |

the number of pixels within

ω_{k}

, and

{\bar{p}}_{k}

the average of pixels within

ω_{k}

, that is

{\bar{p}}_{k} = \frac{1}{| ω_{k} |} \sum_{i \in ω_{k}} p_{i}

(4)

Further, consider that one pixel in the image may be covered by multiple windows, so

a_{k}

and

b_{k}

obtained by Formula (3) may be different from the

ω_{k}

, resulting in the non-uniqueness of

q_{i}

. For this reason, the GIF uses the mean of

a_{k}

and

b_{k}

as the linear coefficient of Formula (1), the final filtering formula is:

q_{i} = {\bar{a}}_{i} X_{i} + {\bar{b}}_{i}

(5)

where

a_{k}

and

b_{k}

is

{\begin{matrix} {\bar{a}}_{i} = \frac{1}{| ω_{i} |} \sum_{k \in ω_{i}} a_{k} \\ {\bar{b}}_{i} = \frac{1}{| ω_{i} |} \sum_{k \in ω_{i}} b_{k} \end{matrix}

(6)

GIF is a fast local filtering edge detection filter, which can effectively avoid the gradient inversion effect, it can keep edge detail information while smoothing the image based on the guidance image, and it has been paid attention to in the fields of image segmentation, detail enhancement and image fusion in recent years [37,38,39].

2.2. NSST and Its Sub-Band Properties

For 2D signals, the composite wavelets (CW) theory [40,41] defines the following composite expansion affine systems:

ℋ_{A B} (ψ) = {ψ_{j, l, k} (x) = {| d e t A |}^{\frac{j}{2}} ψ (B^{l} A^{j} x - k) : j, l \in ℤ, k \in ℤ^{2}}

(7)

where

Ψ \in L^{2} (ℝ^{2})

, A and B are invertible matrices of

2 \times 2

, respectively, and

| d e t B | = 1

.

For the Formula (7),

\forall f \in L^{2} (ℝ^{2})

has:

\sum_{j, l, k} {| ⟨ f, ψ_{j, l, k} ⟩ |}^{2} = ‖ f^{2} ‖

The element

ψ_{j, l, k} (x)

in

ℋ_{A B} (ψ)

is called composite wavelet. As a general framework, the matrix

A^{j}

is associated with scale transformation and matrix

B^{l}

is associated with geometric transformation that are maintained in areas such as rotation and shearing. CW can change not only in scale and position, but also in direction as wavelets do, and it lays a foundation for constructing multi-directional multi-scale transform.

As a special case of CW, the Shearlet transform [31,42] is constructed as follows:

In Formula (7), the following anisotropic expansion and shear matrices are selected for A and B:

A = (\begin{matrix} 4 & 0 \\ 0 & 2 \end{matrix}), B = (\begin{matrix} 1 & 1 \\ 0 & 1 \end{matrix})

The basis function

ψ_{j, l, k} (x)

is given by the following formula:

\hat{ψ} (ξ) = \hat{ψ} (ξ_{1}, ξ_{2}) = {\hat{ψ}}_{1} (ξ_{1}) {\hat{ψ}}_{2} (\frac{ξ_{2}}{ξ_{1}})

(8)

where

ξ = (ξ_{1}, ξ_{2}) \in {\hat{ℝ}}^{2}

,

ξ_{1} \neq 0

, and

{\hat{ψ}}_{1}, {\hat{ψ}}_{2} \in C^{\infty} (\hat{ℝ})

,

s u p p ({\hat{ψ}}_{1}) \subset ⎡ - \frac{1}{2}, - \frac{1}{16} ⎤ ⋃ ⎡ \frac{1}{16}, \frac{1}{2} ⎤

,

s u p p ({\hat{ψ}}_{2}) \subset [- 1, 1]

. so that

\hat{ψ} \in C^{\infty} (\hat{ℝ})

,

s u p p (\hat{ψ}) \subset {⎡ - \frac{1}{2}, \frac{1}{2} ⎤}^{2}

. Further, assume that for

\forall j \geq 0

, the following is true:

{\begin{matrix} \sum_{j \geq 0} {| {\hat{ψ}}_{1} (2^{- 2 j} ω) |}^{2} = 1, i f | ω | \geq \frac{1}{8} \\ \sum_{l = - 2^{j}}^{2^{j} - 1} {| {\hat{ψ}}_{2} (2^{j} ω - l) |}^{2} = 1, i f | ω | \leq 1 \end{matrix}

(9)

This makes the base function

ψ_{j, l, k} (x)

of the Shearlet transform satisfied

s u p p ({\hat{ψ}}_{j, l, k}) \subset {{(ξ_{1}, ξ_{2}) : ξ_{1} \in [- 2^{2 j - 1}, - 2^{2 j - 4}] ⋃ [2^{2 j - 4}, 2^{2 j - 1}]}, | \frac{ξ_{2}}{ξ_{1}} + l 2^{- j} | \leq 2^{- 1}}

(10)

The discretization of Shearlet transform is achieved by combining a laplacian pyramid (LP) filter with a directional filter, where the NSST is a numerical calculation process that replaces the traditional LP with a nonsubsampled LP (NLP) filter [30].

Shearlet transform has achieved better performance than other multi-scale transforms in many aspects of image decomposition [31], such as the frequency domain and the spatial domain, which have very good compact support and localization characteristics, satisfy parabolic scale characteristics, have high directional sensitivity, and near-optimal sparse representation capabilities. Furthermore, because the NLP filtering process does not have down-sampling, it avoids distortion in the directional filtering and guarantees smooth shift invariance and information redundancy. Figure 1 shows the MS of San Clemente area from Worldview02 satellite. After 2-layer NSST decomposition, one low-frequency sub-band and two scale high-frequency sub-bands are obtained, the first scale high-frequency sub-band has two directional sub-bands in horizontal and vertical; the second scale high frequency sub-band has four directions, which are the horizontal direction and

45^{\circ}

direction, vertical and

135^{\circ}

direction.

Image fusion methods based on traditional MRA mostly use the correlation of eight neighborhood coefficients to determine the fusion rules of high-frequency sub-bands, and the directionality of these detail sub-bands is often ignored. In order to quantitatively analyze the directional characteristics of NSST high-frequency sub-bands in MS and PAN, the three groups of data sets shown in Figure 2 are selected, the San clement and Sydney area images of WorldView2 satellites, and the San Francisco region images of Quick-Bird satellite. Statistical analysis is carried out using the sub-band correlation evaluation method based on the variance of directional region [21]. The direction template [43] shown in Figure 3 corresponds to each high-frequency directional sub-band decomposed by NSST, and the variance shown in Equation (11) is used to measure and compare the magnitude of the directional neighborhood correlation with the eight neighborhood correlations. If the directional neighborhood variance is less than the eight neighborhood variances, the directional neighborhood coefficient correlation is stronger than the eight neighborhood coefficient correlations, and vice versa.

{\begin{matrix} μ (i, j) = \frac{1}{3 \times 3} \sum_{m = - 1}^{1} \sum_{n = - 1}^{1} c (i + m, j + n) \\ S D (i, j) = \sqrt{\frac{1}{3 \times 3} \sum_{m = - 1}^{1} \sum_{n = - 1}^{1} {(c (i + m, j + n) - μ (i, j))}^{2}} \end{matrix}

(11)

where

c (i, j)

is the high frequency directional subband coefficient at

(i, j)

,

μ (i, j)

and

S D (i, j)

are the mean of the coefficients in the

3 \times 3

region centered on

(i, j)

and the regional variance corresponding to the corresponding directional template, Table 1 shows the statistical results.

It can be seen from Table 1 that the correlation of the directional neighborhood coefficients in the high frequency sub-bands of NSST was stronger than that of the eight neighborhood coefficients in both MS and PAN, which provided an effective way to further explore the correlation of the high frequency sub-band coefficients of NSST.

3. Methodology

3.1. Construction of the Dual Constraint Guided Filtering Model

The GIF model often produces halo effects when applied as a local filter to smooth edges. In reference [44], GIF is improved, and a GIF model based on edge preservation weighting is presented, inspired by this model, the GIF model based on the constraints of neighborhood elements vector correlation and regional average gradient correlation of the original and guidance images is presented, based on the region characteristics of the image. Effective and appropriate detail features can be adaptively extracted from the guidance image, which lays a foundation for improving the spatial resolution and maintaining spectral information of the MS fused image after information injection.

3.1.1. Neighborhood Elements Vector and Average Gradient Correlation Constraint

In order to effectively measure the correlation between the corresponding location of MS and the PAN, a measurement method based on the neighborhood elements vector angle is proposed, which, together with the average gradient, constrains the extraction of PAN details.

(1) Neighborhood element vector angle

For the PAN image

X

and MS image

Y

, take the

n \times n

neighborhood with

(i, j) (i \in {1, 2, \dots, m}, j \in {1, 2, \dots, n})

as the center (for the adjacent region with edge element as the center, it can be obtained by edge extension method). The neighborhood elements vector angle refers to the adjacent region elements, selected according to the “from top to bottom, left to right”, forming the angle between vectors

x = (\dots, x_{r, s}, \dots)

and

y = (\dots, y_{r, s}, \dots)

(see Figure 4), where

r = i - ⎣ n / 2 ⎦, i - ⎣ n / 2 ⎦ + 1, \dots, i + ⎣ n / 2 ⎦

;

s = j - ⎣ n / 2 ⎦, j - ⎣ n / 2 ⎦ + 1, \dots, j + ⎣ n / 2 ⎦

, the calculation formula is shown in (12), and the range of included angle is

[0, π / 2]

.

S A M (i, j) \equiv θ (i, j) = a r c c o s \frac{\sum_{r = i - ⎣ n / 2 ⎦}^{i + ⎣ n / 2 ⎦} \sum_{s = j - ⎣ n / 2 ⎦}^{j + ⎣ n / 2 ⎦} x_{r, s} y_{r, s}}{\sqrt{(\sum_{r = i - ⎣ n / 2 ⎦}^{i + ⎣ n / 2 ⎦} \sum_{s = j - ⎣ n / 2 ⎦}^{j + ⎣ n / 2 ⎦} x_{r, s}^{2}) (\sum_{r = i - ⎣ n / 2 ⎦}^{i + ⎣ n / 2 ⎦} \sum_{s = j - ⎣ n / 2 ⎦}^{j + ⎣ n / 2 ⎦} y_{r, s}^{2})}}

(12)

The smaller the

θ

value is, the stronger the correlation between the two elements is, and vice versa. The neighborhood elements vector angle reflects the correlation of the corresponding elements, which makes the measurement of the correlation of heterogeneous RS images, such as PAN and MS, more robust.

(2) Average gradient

The gradient of image pixel reflects the change rate and direction of pixel gray, which can reveal the direction and intensity of image edge. For the 2D image function

f (\cdot, \cdot)

, the gradient at

(i, j)

is defined as follows:

\nabla f (i, j) \equiv g r a d (f (i, j)) = {[\frac{\partial f (i, j)}{\partial i}, \frac{\partial f (i, j)}{\partial j}]}^{T}

(13)

the amplitude is

M (i, j) \equiv m a g (\nabla f (i, j)) = \sqrt{{(\frac{\partial f (i, j)}{\partial i})}^{2} + {(\frac{\partial f (i, j)}{\partial j})}^{2}}

(14)

Figure 5 shows the edge detail image obtained by gradient operator for MS and PAN.

For the

n \times n

neighborhood (see Figure 4) in the PAN with

(i, j)

as the center, the average gradient can reflect the clarity of the region block well, the small detail contrast and the change of texture. In this paper, the neighborhood elements vector angle and the average gradient are used as the constraint terms of the proposed improved guided filtering model to ensure the balance of extracting detail information from the PAN, in order to avoid the spectral distortion caused by over injection of detail information in the process of information fusion. The numerical expression of the average gradient of

n \times n

neighborhood is shown in Equation (15).

A G (i, j) \equiv \bar{\nabla f} (i, j) = \frac{1}{n \times n} \sum_{r = i - ⎣ n / 2 ⎦}^{j + ⎣ n / 2 ⎦} \sum_{s = j - ⎣ n / 2 ⎦}^{j + ⎣ n / 2 ⎦} \sqrt{\frac{{(f (r + 1, s) - f (r, s))}^{2} + {(f (r, s + 1) - f (r, s))}^{2}}{2}}

(15)

3.1.2. Region Edge Recognition Function Based on the Neighborhood Elements Vector Angle and the Average Gradient

Let the size of input image

p

and guidance image

X

be

m \times n

, for any position

(i, j)

in

p

and

X

, define the

3 \times 3

neighborhood

ω_{i, j}

with

(i, j)

as the center (see Section 3.1.1). Let

S A M_{X & p_ω} (i, j)

be the neighborhood elements vector angle between adjacent area

ω

of

p

and

X

(see Formula (12)) and

S A M_{X & p}

be the overall element vector angle of

p

and

X

(that is, the whole image is regarded as the adjacent area of its center point); Let

A G_{X_ω} (i, j)

be the average gradient value of the

ω

in

X

(see Equation (15)), and

A G_{X}

be the overall average gradient value of

X

, and normalize

A G_{X_ω} (i, j)

and

A G_{X}

to the same value range

[0, π / 2]

as

S A M_{X & p_ω} (i, j)

and

S A M_{X & p}

. The region edge recognition function

Ψ (i, j)

is constructed as follows:

Ψ (i, j) \equiv \frac{s i n (S A M_{X & p_ω} (i, j)) + s i n (A G_{X_ω} (i, j)) + λ}{\sin (S A M_{X & p}) + \sin (A G_{X}) + λ}

(16)

where

λ

is a small constant (1 × 10⁻⁶ is used in this article), in order to avoid zero denominator.

In order to analyze the effectiveness of the constructed edge recognition function

Ψ (i, j)

, we selected three MS and PAN images with different characteristics and make distribution statistics of their region recognition functions. Figure 6 is the visualization result of the distribution statistics of the experimental image (only PAN). It can be seen that the defined edge recognition function could recognize the smooth region and the edge texture region of the image well, which made it possible for the adaptive constraint to obtain the detail information from the guidance image.

3.1.3. Model Construction

For the same position

(i, j) (i \in {1, 2, \dots, m}, j \in {1, 2, \dots, n})

of the input image

p

and the guidance image

X

with the size of

m \times n

, a window with radius

r

centered on

(i, j)

is defined, based on the proposed region edge recognition function

Ψ (i, j)

, the minimum cost function of the improved guide filtering id defined as follows:

E (a_{i, j}, b_{i, j}) = \sum_{(s, t) \in ω_{(i, j)_r}} ({(a_{i, j} X_{s, t} + b_{i, j} - p_{s, t})}^{2} + \frac{η}{Ψ (i, j)} a_{i, j}^{2})

(17)

(a_{i, j}, b_{i, j})

is the linear coefficient of

ω_{(i, j)_r}

,

X_{s, t} \in X, p_{s, t} \in

p,

η

is the regularization parameter (0.64 in this experiment).

Based on the linear regression method,

a_{i, j}

and

b_{i, j}

can be obtained:

{\begin{matrix} a_{i, j} = \frac{\frac{1}{| ω_{(i, j)_r} |} \sum_{(s, t) \in ω_{(i, j)_r}} X_{s, t} p_{s, t} - μ_{i, j} {\bar{p}}_{i, j}}{δ_{i, j}^{2} + \frac{η}{Ψ (i, j)}} \\ b_{i, j =} {\bar{p}}_{i, j} - a_{i, j} μ_{i, j} \end{matrix}

(18)

where

δ_{i, j}^{2}

and

μ_{i, j}

represent the variance and mean value of the guidance image

X

in the window

ω_{(i, j)_r}

, respectively, and

{\bar{p}}_{i, j}

is shown in Equation (19):

{\bar{p}}_{i, j} = \frac{1}{| ω_{(i, j)_r} |} \sum_{(s, t) \in ω_{(i, j)_r}} p_{s, t}

(19)

Let the output image of

p

guided by

X

be

q

, further from Equations (17)–(19), we can obtain the Dual Constraint Guided Image Filtering (DCGIF) model based on the neighborhood elements vector angle and the average gradient as follows:

q_{i, j} = {\bar{a}}_{i, j} X_{i, j} + {\bar{b}}_{i, j}

(20)

where

X_{i, j} \in X

,

q_{i, j} \in q

,

{\bar{a}}_{i, j}

and

{\bar{b}}_{i, j}

as follows:

{\begin{matrix} {\bar{a}}_{i, j} = \frac{1}{| ω_{(i, j)_r} |} \sum_{(s, t) \in ω_{(i, j)_r}} a_{s, t} \\ {\bar{b}}_{i, j} = \frac{1}{| ω_{(i, j)_r} |} \sum_{(s, t) \in ω_{(i, j)_r}} b_{s, t} \end{matrix}

(21)

The core of the guided filter is the determination of the linear coefficients in the window. Differing from the traditional GIF, which uses the regularization parameter

η

to avoid the linear parameter

a_{i, j}

in the window

ω_{(i, j)_r}

being too large, the DCGIF model adaptively adjusts

a_{i, j}

and

b_{i, j}

, based on the neighborhood elements vector angle and the average gradient of the region edge recognition function, to make the extracted filtered texture details more accurate. It is applied to MS-Pansharpening to ensure that the spatial information extracted from Pan has stronger adaptability under MS guidance.

Figure 7 shows a comparison between the traditional GIF model and the DCGIF model using MS as a guidance image to extract spatial detail information from the PAN. The window radius is selected as 1. From the displayed image, it can be seen that the spatial detail information extracted by the DCGIF was more delicate, and, at the same time, it had a better matching degree with the original image, which laif the foundation for improving the fusion effect of MS-Pansharpening.

3.2. MS-Pansharpening Algorithm Based on DCGIF in NSST Domain

3.2.1. Reconstruction of MS Intensity Component Guided by High Frequency Information of NSST from PAN

In order to make the spatial details extracted from the PAN better match the MS image, this paper used the NSST high frequency sub-band of the PAN as the guidance information to reconstruct the intensity component I from the IHS transform of the MS. The process is shown in Figure 8.

First, an IHS transform was performed on the MS sampled to the same size of the PAN; then, the I component of the MS was decomposed into two scales of NSST, and one low-frequency sub-band and six high-frequency direction sub-bands were obtained. The high frequency sub-bands included the horizontal

(0^{°})

, vertical

(90^{°})

direction of the first scale and the horizontal

(0^{°})

, anti-diagonal

(45^{°})

, vertical

(90^{°})

, principal diagonal

(135^{°})

of the second scale (see Figure 3). The low frequency sub-band of the I component of MS was preserved as the low frequency sub-band of the final reconstructed I component to maintain the overall overview and spatial characteristics of the original MS. For each high frequency sub-band combined with the high frequency sub-band corresponding to the PAN, a joint high-frequency sub-band was generated through the following process of maximum energy in the direction region.

For the I component of MS and each high frequency sub-band of direction

d (d \in {0^{°}, 45^{°}, 90^{°}, 135^{°}})

corresponding to the same level of PAN decomposed by NSST(denoted as “

I_{d}^{h}

” and “

P A N_{d}^{h}

”), the direction energy of the direction coefficient in the

3 \times 3

neighborhood

d

centered on the current position

(i, j)

was calculated by traversing from top to bottom and from left to right (see Equation (22)). For the edge coefficient case, the symmetric continuation could be used. Figure 9 shows the high frequency sub-band neighborhood direction diagram.

{\begin{array}{l} D E_{T, 0^{°}} (i, j) = T^{2} (i, j - 1) + T^{2} (i, j) + T^{2} (i, j + 1) \\ D E_{T, 45^{°}} (i, j) = T^{2} (i - 1, j + 1) + T^{2} (i, j) + T^{2} (i + 1, j - 1) \\ D E_{T, 90^{°}} (i, j) = T^{2} (i - 1, j) + T^{2} (i, j) + T^{2} (i + 1, j) \\ D E_{T, 135^{°}} (i, j) = T^{2} (i - 1, j - 1) + T^{2} (i, j) + T^{2} (i + 1, j + 1) \end{array}

(22)

where

T \in {I_{d}^{h}, P A N_{d}^{h}}

,

D E_{T, d} (i, j)

represents the directional energy of

d

at the current position

(i, j)

of the high frequency subband

T

.

Further, the joint high frequency sub-band

{\hat{I}}_{d}^{h}

is determined according to the Formula (23):

{\hat{I}}_{d}^{h} (i, j) = {\begin{matrix} I_{d}^{h} (i, j), i f D E_{I_{d}^{h}, d} (i, j) \geq D E_{P A N_{d}^{h}, d} (i, j) \\ P A N_{d}^{h} (i, j), e l s e \end{matrix}

(23)

After determining the level and direction of the joint high frequency sub-band of the I component of the MS, the I component

I_{M S & P A N}^{R}

of the reconstructed MS could be obtained by inverse NSST.

3.2.2. Extraction of Decision Texture Details for PAN Based on DCGIF Model

(1) PAN detail information coefficient fusion based on region energy weighted matrix

In order to obtain more abundant and more adaptable texture detail information from the PAN, and, at the same time, to contain as little redundant information as possible, this paper filtered the PAN twice based on the proposed DCGIF model to obtain the detail information. Further, the final decision texture details were obtained through a coefficient fusion mechanism weighted by the region energy matrix. The overall process is shown in Figure 10.

First, with the reconstructed component

I_{M S & P A N}^{R}

of MS intensity component (see Section 3.2.1) and PAN as the guidance image, the DCGIF model was used to filter the PAN twice to obtain two texture information images,

D e t a i l_{P A N \to P A N}

and

D e t a i l_{I_{M S & P A N}^{R} \to P A N}

. Further, the

3 \times 3

local region energy matrix

D e_E n_{P A N \to P A N}

and

D e_E n_{I_{M S & P A N}^{R} \to P A N}

corresponding to the two texture information images were calculated. The two matrices were the same size as the two texture information images, and the elements of the matrix were the sum of squares of the elements in the

3 \times 3

neighborhood with the same location as the center of the corresponding texture information image.

Then, the corresponding two-weight matrix

ω_{P A N \to P A N}

and

ω_{I_{M S & P A N}^{R} \to P A N}

was constructed, based on matrix

D e_E n_{P A N \to P A N}

and

D e_E n_{I_{M S & P A N}^{R} \to P A N}

:

{\begin{matrix} ω_{P A N \to P A N} (i, j) = \frac{D e_E n_{P A N \to P A N} (i, j)}{D e_E n_{P A N \to P A N} (i, j) + D e_E n_{I_{M S & P A N}^{R} \to P A N} (i, j)} \\ ω_{I_{M S & P A N}^{R} \to P A N} (i, j) = \frac{D e_E n_{I_{M S & P A N}^{R} \to P A N} (i, j)}{D e_E n_{P A N \to P A N} (i, j) + D e_E n_{I_{M S & P A N}^{R} \to P A N} (i, j)} \end{matrix}

(24)

Finally, the final decision texture details matrix for injection into MS was obtained by (25):

\begin{matrix} D e t a i l_{F i n a l} (i, j) = (ω_{P A N \to P A N} (i, j) \times D e t a i l_{P A N \to P A N} (i, j)) \\ + (ω_{I_{M S & P A N}^{R} \to P A N} (i, j) \times D e t a i l_{I_{M S & P A N}^{R} \to P A N} (i, j)) \end{matrix}

(25)

(2) Discussion on window radius

r

in DCGIF Model

In order to discuss the influence of the filtering window radius

r

on the result and the fusion result after injection into the target image in DCGIF model, a statistical experiment was carried out on MS and PAN in San Clement area (see Figure 6). The filtering window radii were selected as 1, 2, 4 and 8, respectively. The corresponding PAN was filtered by DCGIF using the I component of MS as the guidance image to obtain the filtered detail image. To further observe the details injected into the target image, the details obtained were injected into the MS by simple matrix addition (see Figure 11 for specific results); root mean square error (RMSE) [45] and spectral angle mapper (SAM) [46] were used as objective evaluation indices to evaluate the fused MS (see Table 2).

From the above experimental results, it can be seen that as the radius of the filtering window increased, the image with filtered detail information appeared to be over-sharpened, and, at the same time, the fused image had some spectral distortion in the spectral information, such as the distortion of residential areas was more obvious, while in the spatial information, the texture detail information that was over injected created overlap in the local region. The larger the window radius was, the larger the RMSE value of the fused image, and the larger the SAM value. For this reason, the radius of the DCGIF of the PAN in the subsequent algorithm of this article was selected as 1 filtering window.

3.2.3. Detail Information Injection Model for MS Based on Spectral Correlation

Multi-band RS imagery is an image formed from reflected light from different spectral bands for the same object. The correlation between bands in a multi-band RS image is often referred to as spectral correlation. To investigate the spectral correlation between adjacent bands of MS, we selected eight sets of MS, and the correlation between adjacent bands in MS was calculated using the “spectral correlation coefficient” shown in Equation (26). The statistical results are shown in Table 3.

r_{i j} = \frac{\sum_{x = 1}^{M} \sum_{y = 1}^{N} ((I_{i} (x, y) - {\bar{I}}_{i}) (I_{j} (x, y) - {\bar{I}}_{j}))}{\sqrt{\sum_{x = 1}^{M} \sum_{y = 1}^{N} {(I_{i} (x, y) - {\bar{I}}_{i})}^{2} \sum_{x = 1}^{M} \sum_{y = 1}^{N} {(I_{j} (x, y) - {\bar{I}}_{j})}^{2}}}

(26)

where

I_{i}

and

I_{j}

are the

i

and

j

band images (size

M \times N

);

{\bar{I}}_{i} = \frac{1}{M \times N} \sum_{x = 1}^{M} \sum_{y = 1}^{N} I_{i} (x, y)

,

{\bar{I}}_{j} = \frac{1}{M \times N} \sum_{x = 1}^{M} \sum_{y = 1}^{N} I_{j} (x, y)

;

r_{i j}

is the correlation coefficient of the

i

and

j

band.

From the statistical results in Table 3, it can be seen that the correlation coefficients between adjacent bands of MS were between

0.9 ~ 1

and there was a strong correlation between adjacent bands. Therefore, in the MS-Pansharpening algorithm based on detail information injection, if the detail injection was only applied to a certain band or a certain component of MS, the relationship between the original MS bands would be broken to some extent, so that the spectral information of the fused MS would be distorted. Based on this, a weighted detail information injection model based on spectral correlation is proposed:

Assume that the size of the MS is

M \times N

and

I_{M S_B a n d_k}

is its

k (k \in {1, 2, \dots, n})

band, where

n

is the total number of bands of the MS. For texture detail matrix

D e t a i l_{F i n a l}

(see Equation (25)), construct the weight matrix

α_{M S_B a n d_k} (k \in {1, 2, \dots, n})

as shown in Formula (27):

α_{M S_B a n d_k} (i, j) = \frac{I_{M S_B a n d_k} (i, j)}{\sum_{k = 1}^{n} I_{M S_B a n d_k} (i, j)}

(27)

According to the following Formula (28), the texture detail information is injected to form the final fused MS

{\hat{I}}_{M S_B a n d_k}

:

{\hat{I}}_{M S_B a n d_k} (i, j) = I_{M S_B a n d_k} (i, j) + α_{M S_B a n d_k} (i, j) \times D e t a i l_{F i n a l} (i, j)

(28)

where

k \in {1, 2, \dots, n}

,

i \in {1, 2, \dots, M}

,

j \in {1, 2, \dots, N}

.

3.2.4. The Process of Algorithm Implementation

Based on the previous analysis and discussion, the implementation process of the MS-Pansharpening algorithm in this article is given, and the overall flow chart of the algorithm is shown in Figure 12.

The implementation process of the Algorithm 1 is as follows:

Algorithm 1 Procedures of the proposed approach.

Input: The registered PAN and MS RS images.

Initialization:: The MS is sampled to the same size as the PAN, and then the MS is transformed by IHS, the intensity component I is extracted, and the component H and component S are retained.

1:: The I component and PAN are transformed by three layers of NSST to obtain the corresponding low-frequency and high-frequency sub-bands. For each direction sub-band, according to its direction, and according to Formulas (22) and (23), the fusion rule of maximum energy in the direction region is used to obtain the joint high-frequency sub-band, and then inverse NSST is used to obtain the reconstructed I component $I_{M S & P A N}^{R}$ ;

2:: Use the DCGIF model to filter the PAN twice. First, the $I_{M S & P A N}^{R}$ obtained in Step 1 is used as the guidance image to obtain the texture detail image $D e t a i l_{I_{M S & P A N}^{R} \to P A N}$ . Second, the PAN is used as the guidance image to obtain the texture detail image $D e t a i l_{P A N \to P A N}$ ;

3:: The $3 \times 3$ local region energy matrices $D e_E n_{I_{M S & P A N}^{R} \to P A N}$ and $D e_E n_{P A N \to P A N}$ corresponding to $D e t a i l_{I_{M S & P A N}^{R} \to P A N}$ and $D e t a i l_{P A N \to P A N}$ are calculated respectively, and the two-weight matrices $ω_{I_{M S & P A N}^{R} \to P A N}$ and $ω_{P A N \to P A N}$ are constructed (see Equation (24));

4:: According to Formula (25), $D e t a i l_{I_{M S & P A N}^{R} \to P A N}$ and $D e t a i l_{P A N \to P A N}$ are weighted average based on the weight matrix $ω_{I_{M S & P A N}^{R} \to P A N}$ and $ω_{P A N \to P A N}$ to obtain the final decision texture detail information matrix $D e t a i l_{F i n a l}$ for injection into MS;

5:: For the $I_{M S & P A N}^{R}$ constructed by Step 1 and the components H and S reserved by Step 1, the weight matrices $α_{I_{M S & P A N}^{R}}$ , $α_{H}$ , and $α_{S}$ are obtained according to Equation (27);

6:: According to the weighted detail information injection model based on (28), the detail information matrix $D e t a i l_{F i n a l}$ is injected into $I_{M S & P A N}^{R}$ , H, and S to form the components ${\hat{I}}_{M S & P A N}^{R}$ , $\hat{H}$ , and $\hat{S}$ after the detail information is injected.

7:: Inverse IHS is used to transform the ${\hat{I}}_{M S & P A N}^{R}$ , $\hat{H}$ , and $\hat{S}$ to obtain the fused MS image.

Output: the fused MS image.

4. Experiment and Analysis

Experiments were conducted with an Intel Core i7 6700 CPU 3.4 GHz with 32 GB RAM in a Windows7 system, and the experimental platform was Matlab R2014a. For all the methods that required NSST, the ‘maxflat’ nonsubsampled multiscale filter and improved shear filter banks were selected, and three levels of NSST decomposition were performed on the MS and PAN. The number of sub-bands in each scale direction was 2, 4 and 8, respectively. At the same time, seven classical fusion algorithms were selected for comparison experiments, which included IHS algorithm, traditional GIF algorithm [36], Á Trous Wavelet Transform (ATWT) algorithm [47], Additive Wavelet Luminance Proportional (AWLP) algorithm [47], Adaptive Sparse Representation (ASR) algorithm [48], Deep Leaning algorithm [49] and NSST + PCNN + SR algorithm [21].

4.1. Data Set

Five sets of RS image data from two different satellites, QucikBird02 and WorldView02 (see Figure 13) were obtained. Among them, Test-RS-1, Test-RS-3 and Test-RS-5 were from the QucikBird02 satellite in the San Francisco area, while PAN had a 0.7 m spatial resolution and MS had a 2.8 m spatial resolution. The sizes of MS and PAN images were

400 \times 400

and

800 \times 800

, respectively. Test-RS-2 and Test-RS-4 were from the WorldView02 satellite and were taken from the San Clemente and Sydney regions. The spatial resolution of the RS images was 0.5 m. The selected MS and PAN experimental images took into account a variety of landform information, including plains, waters, residential areas, docks, mountains and road areas. The sizes of MS and PAN images were

400 \times 400

and

800 \times 800

, respectively. All the PAN and MS images in the dataset to be fused were registered images. The PAN was a single-band image and the MS contained four bands of R, G, B, Near Infrared (NIR), but, due to the three-channel image characteristics, the MS shown in this paper only contained R, G, B.

This paper divided the experiment into two parts: one was the experiment of degraded data on Test-RS-1~Test-RS-4, and the other was the real data experiment of Test-RS-5. For the real data experiment, first the MS was down-sampled to obtain a reduced spatial resolution MS image, so that the original MS could be used as a reference image. Further, the reduced spatial resolution MS was sampled up to the same size as the PAN, and then the fusion experiments were performed.

4.2. Degraded Data Experiment

(1) Subjective evaluation

Figure 14 shows the MS and the fusion results of eight different algorithms for four sets of degraded data, with a local area enlarged three times, where “NSST + DCGIF” was the algorithm in this paper.

From the perspective of spectral preservation, the spectral distortions in the result obtained by traditional IHS algorithm, ASR algorithm and deep learning algorithm were more obvious, among which the overall color of Test-RS-1, especially the green forest area, was brighter. The color information of the IHS algorithm in Test-RS-2 was lighter on roads, on the top of red houses, and on the green vegetation area than the source MS, and the whole image was darker. The results obtained by ASR and deep learning algorithms were also distorted in the spectrum, and the color of vegetation and roof was bright. The results obtained by the other five algorithms were similar in spectral information to the source MS image. From the spatial details, although the traditional GIF algorithm maintained good spectral information, overlap occurred at the edge texture, and the details were blurred, such as Test-RS-2 vegetation information appearing in blocks, and Test-RS-3 producing black patches in tree forest areas, and there was loss of some texture details. The sea surface texture details were also lost, so it appeared too smooth. The local magnification showed that the ATWT algorithm maintained the spectrum well, but some of the spatial detail information was lost and the corresponding texture detail edges were not significantly improved. The AWLP algorithm was subjectively better but the spectrum was distorted compared to the MS image and the colors were more vivid in the local area. The NSST + SR + PCNN algorithm improved the spatial detail information greatly, but the spectrum was distorted to some extent, especially in vegetation areas. The resulting image from the NSST + DCGIF algorithm proposed in this paper, kept good spectral information compared with the other comparison algorithms, and the spectral information was close to the original MS in vegetation, sea waves, land and other areas, The spatial information was also greatly enhanced. Detailed information, such as sea waves, land, forests and so on, were enhanced, the edge contour information was clearer of buildings, road and land were clearly reflected, and the texture structure of different objects could be distinguished. It effectively enhanced the spatial resolution, while maintaining the spectral information of MS.

(2) Objective evaluation

For the objective evaluation of the degraded data experimental, the commonly used RMSE, erreur relative global adimensionnelle de Synthese (ERGAS) [50], structural similarity index (SSIM) [51] and correlation coefficient (CC) [52] as the spatial quality evaluation index were applied. Relative average spectral error (RASE) [45] and SAM were used as spectral quality evaluation indices. Table 4 gives the statistical results of the objective evaluation indices for the fused images of different algorithms corresponding to Figure 14.

It can be seen from Table 4 that the proposed fusion algorithm achieved a good evaluation index in most cases, and achieved better statistical results than the other methods. In particular, the objective evaluation of SAM demonstrated that the proposed NSST-DCGIF avoided, to a certain extent, the spectral distortion phenomenon caused by the injection of non-adaptive information, which showed that the algorithm could maintain the spectral characteristics of MS very well and made full use of the high spatial resolution of the PAN image to effectively improve the spatial resolution of the fused image. Compared with the other seven algorithms, it had obvious advantages.

4.3. Real Data Experiment

(1) Subjective evaluation

Figure 15 shows the target image fused with 8 different algorithms for real data. It can be seen that the GIF algorithm performed poorly in spatial detail information, the edge information of the fused result was partially lost, and the spectral information was darkened, while the color distortion was more obvious in vegetation areas. The traditional IHS algorithm, ASR algorithm and deep learning algorithm had different degrees of spectral distortion. ATWT, AWLP, NSST + SR + PCNN and the proposed algorithm were similar in spectral aspect to the source MS, but the proposed algorithm performed better in spatial detail information enhancement, with rich textural information and obvious edge information.

(2) Objective evaluation

In the real data experiment, we chose quality with no reference (

Q N R

) [53] index to objectively evaluate the fused image.

Q N R

was used to calculate the cross-covariance between bands as the overall quality evaluation index. It was composed of the spectral distortion evaluation index

D_{λ}

and the spatial detail distortion evaluation index

D_{s}

. Table 5 shows the statistical results of the objective evaluation indices of the fusion images of the different algorithms corresponding to Figure 15. It can be seen from Table 5 that the proposed algorithm was better than the other algorithms in

QNR

and

D_{s}

, and the Deep learning algorithm performed better in

D_{λ}

. From the comprehensive subjective evaluation and objective evaluation, the algorithm in this paper in the real data experiment was better than the other comparative algorithms.

4.4. Computation Complexity Analysis

Table 6 gives a comparative analysis of the time spent by the eight algorithms on five groups of data sets. From the statistical results in the table, it can be seen that ATWT, AWLP, IHS and traditional GIF algorithms consumed less time, but the fusion results obtained by subjective and objective evaluations were not ideal, resulting in spectral information distortion, loss of spatial details and insufficient spatial information promotion. The time of the Deep learning algorithm counted in this paper did not include the time of the training process, and the training time of deep learning algorithm is generally longer. The ASR algorithm took the longest time. The NSST + SR + PCNN algorithm took about 2.5 times as long as the proposed NSST + DCGIF algorithm, but the quality of the fusion was not as good as that of this algorithm. It can be seen that NSST + DCGIF maintained the validity of the algorithm while taking into account computational efficiency.

5. Conclusions

In this paper, the MS-Pansharpening algorithm in the NSST domain, based on dual constraint guided filtering, is presented. First, the intensity component of MS and the PAN are decomposed by NSST, and then the intensity component of MS is reconstructed by using the principle of directional region energy maximization. Then the spatial detail information of PAN is extracted by using the guided filtering model proposed in this paper, which is based on the average gradient correlation and vector correlation formed by the current location neighborhood. The model ensures the applicability, integrity and accuracy of the extracted information. Finally, the extracted spatial details are injected into each component of MS to get the final fused image, based on the weighted detail information injection model, which maintains the spectral correlation proposed. The proposed NSST + DCGIF model avoids the spectral distortion phenomenon caused by the injection of non-adaptive information to a certain extent. It effectively improves the spatial resolution of the fused image maintaining the spatial detail information while maintaining the spectral information well. Through comparative experimental analysis, subjective and objective indications, especially objective evaluation, the proposed algorithm demonstrated better fusion results than some of the typical MS-Pansharpening algorithms currently available.

Author Contributions

Conceptualization and methodology, X.W.; Formal analysis and writing—original draft preparation, Z.M.; Investigation validation, S.B.; Data curation, Y.F.; Supervision, Writing-review & editing, R.S.; All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 41971388) and the Innovation Team Support Program of Liaoning Higher Education Department (Grant No. LT2017013).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the editors and the reviewers for their valuable suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

Chuvieco, E. Sensors and Remote Sensing Satellites; CRC Press: Boca Raton, FL, USA, 2020. [Google Scholar]
Toth, C.; Jozkow, G. Remote sensing platforms and sensors: A survey. ISPRS J. Photogramm. Remote Sens. 2016, 115, 22–36. [Google Scholar] [CrossRef]
Chuvieco, E. Fundamentals of Satellite Remote Sensing; CRC Press: Boca Raton, FL, USA, 2020. [Google Scholar]
Siok, K.; Ewiak, I.; Jenerowicz, A. Multi-sensor fusion: A Simulation approach to pansharpening aerial and satellite images. Sensors 2020, 20, 7100. [Google Scholar] [CrossRef] [PubMed]
Alparone, L.; Aiazzi, B.; Baronti, S.; Garzelli, A. Remote Sensing Image Fusion; CRC Press: Boca Raton, FL, USA, 2015. [Google Scholar]
Meng, X.; Xiong, Y.; Shao, F.; Shen, H.; Sun, W.; Yang, G.; Yuan, Q.; Fu, R.; Zhang, H. A large-scale benchmark data set for evaluating pansharpening performance: Overview and implementation. IEEE Geosci. Remote Sens. Mag. 2021, 9, 18–52. [Google Scholar] [CrossRef]
Kaur, G.; Saini, K.S.; Singh, D.; Kaur, M. A comprehensive study on computational pansharpening techniques for remote sensing images. Arch. Comput. Methods Eng. 2021, 28, 4961–4978. [Google Scholar] [CrossRef]
Carper, W.; Lillesand, T.; Kiefer, R. The use of intensity-hue-saturation transformations for merging SPOT panchromatic and multispectral image data. Photogramm. Eng. Remote Sens. 1990, 56, 459–467. [Google Scholar]
Chavez, P.S.; Kwarteng, A.W. Extracting spectral contrast in Landsat Thematic Mapper image data using selective principal component analysis. Photogram. Eng. Remote Sess. 1989, 55, 339–348. [Google Scholar]
Aiazzi, B.; Baronti, S.; Selva, M. Improving component substitution pansharpening through multivariate regression of MS + Pan data. IEEE Trans. Geosci. Remote Sens. 2007, 45, 3230–3239. [Google Scholar] [CrossRef]
Tu, T.M.; Su, S.C.; Shyu, H.C.; Huang, P.S. A new look at IHS-like image fusion methods. Inf. Fusion 2001, 2, 177–186. [Google Scholar] [CrossRef]
Rahmani, S.; Strait, M.; Merkurjev, D.; Moeller, M.; Wittman, T. An adaptive IHS pan-sharpening method. IEEE Geosci. Remote Sens. Lett. 2010, 7, 746–750. [Google Scholar] [CrossRef]
Amolins, L.; Zhang, Y.; Dare, P. Wavelet based image fusion techniques-an troduction, review and comparison. ISPRS J. Photogramm. Remote Sens. 2007, 62, 249–263. [Google Scholar] [CrossRef]
Li, H.; Manjunath, B.S.; Mitra, S.K. Multisensor image fusion using the wavelet transform. Graph. Models Image Process. 1995, 57, 235–245. [Google Scholar] [CrossRef]
Zhou, J.; Civco, D.L.; Silander, J.A. A wavelet transform method to merge Landsat TM and SPOT panchromatic data. Int. J. Remote Sens. 1998, 19, 743–757. [Google Scholar] [CrossRef]
Gonzalez-Audicana, M.; Otazu, X.; Fors, O.; Seco, A. Comparison between mallat’s and the “a trous” discrete wavelet transform based algorithms for the fusion of multispectral and panchromatic images. Int. J. Remote Sens. 2005, 26, 595–614. [Google Scholar] [CrossRef]
Aiazzi, B.; Alparone, L.; Baronti, S.; Garzelli, A. Context-driven fusion of highspatial and spectral resolution images based on oversampled multiresolution analysis. IEEE Trans. Geosci. Remote Sens. 2002, 40, 2300–2312. [Google Scholar] [CrossRef]
Amro, I.; Mateos, J. Multispectral image pansharpening based on the contourlet transform. Inf. Opt. Photonics 2010, 206, 247–261. [Google Scholar]
Amro, I.; Mateos, J. General shearlet pansharpening method using Bayesian inference. In Proceedings of the 2013 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), Poznan, Poland, 26–28 September 2013; pp. 231–235. [Google Scholar]
Upla, P.K.; Gajjar, P.P.; Joshi, M.V. Pan-sharpening based on non-subsampled contourlet transform detail extraction. In Proceedings of the 2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), Jodhpur, India, 18–21 December 2013; pp. 1–4. [Google Scholar]
Wang, X.H.; Bai, S.F.; Li, Z.; Song, R.X.; Tao, J.Z. The PAN and MS image pansharpening algorithm based on adaptive neural network and sparse representation in the NSST domain. IEEE Access 2019, 7, 52508–52521. [Google Scholar] [CrossRef]
Zhang, L.; Shen, H.; Gong, W.; Zhang, H. Adjustable model-based fusion method for multispectral and panchromaticimages. IEEE Trans. Syst. Man Cybern. 2012, 42, 1693–1704. [Google Scholar] [CrossRef]
Shen, H.; Meng, X.; Zhang, L. An integrated framework forthe spatio-temporal-spectral fusion of remote sensing images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 7135–7148. [Google Scholar] [CrossRef]
Yang, J.F.; Fu, X.Y.; Hu, Y.W.; Huang, Y.; Ding, X.H.; Paisley, J. PanNet: A deep network architecture for Pan-Sharpening. In Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1753–1761. [Google Scholar]
Huang, W.; Xiao, L.; Wei, Z.; Liu, H.; Tang, S. A new pansharpeningmethod with deep neural networks. IEEE Geosci. Remote Sens. Lett. 2015, 12, 1037–1041. [Google Scholar] [CrossRef]
Masi, G.; Cozzolino, D.; Verdoliva, L.; Scarpa, G. Pansharpeningby convolutional neural networks. Remote Sens. 2016, 8, 594. [Google Scholar] [CrossRef]
Liu, J.M.; Feng, Y.Q.; Zhou, C.S.; Zhang, C.X. PWNet: An adaptive weight network for the fusion of panchromatic and multispectral Images. Remote Sens. 2020, 12, 2804. [Google Scholar] [CrossRef]
Starck, J.L.; Murtagh, F.; Fadili, J.M. Sparse Image and Signal Processing: Wavelets and Related Geometric Multiscale Analysis; Cambridge University Press: New York, NY, USA, 2015. [Google Scholar]
Lang, M.; Guo, H.; Odegard, J.E.; Burrus, C.S.; Wells, R.O. Noise reduction using an undecimated discrete wavelet transform. IEEE Signal Process. Lett. 2016, 3, 10–12. [Google Scholar] [CrossRef]
Cunha, D.; Arthur, L.; Zhou, J.; Do, M.N. The nonsubsampled contourlet transform: Theory, design, and applications. IEEE Trans. Image Process. 2006, 15, 3089–3101. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Asley, G.E.; Labate, D.; Lim, W.Q. Sparse directional image representations using the discrete shearlet transform. Appl. Comput. Harmon. Anal. 2008, 25, 25–46. [Google Scholar] [CrossRef]
Yang, Y.; Dai, M.; Zhou, L.Y. Fusion of infrared and visible images based on NSUDCT. Infrared Laser Eng. 2012, 43, 961–966. [Google Scholar]
Qu, Z.; Xing, Y.Q.; Song, Y.F. An image enhancement method based on non-subsampled shearlet transform and directional information measurement. Information 2018, 9, 308. [Google Scholar] [CrossRef]
Li, J.Y.; Zhang, C.Z. Blind watermarking scheme based on schur decomposition and non-subsampled contourlet transform. Multimed. Tools Appl. 2020, 79, 30007–30021. [Google Scholar] [CrossRef]
Kong, W.W.; Miao, Q.G.; Lei, Y. Multimodal sensor medical image fusion based on local difference in non-subsampled domain. IEEE Trans. Instrum. Meas. 2019, 68, 938–951. [Google Scholar] [CrossRef]
He, K.; Sun, J.; Tang, X. Guided image filtering. IEEE Trans. Softw. Eng. 2013, 35, 1397–1409. [Google Scholar] [CrossRef]
Guo, L.; Chen, L.; Chen, C.P.; Zhou, J. Integrating guided filter into fuzzy clustering for noisy image segmentation. Digit. Signal Process. 2018, 83, 235–248. [Google Scholar] [CrossRef]
Liu, S.; Hu, Q.; Tong, X.; Xia, J.; Du, Q.; Samat, A.; Ma, X. A multi-scale superpixel-guided filter feature extraction and selection approach for classification of very-high-resolution remotely sensed imagery. Remote Sens. 2020, 12, 862. [Google Scholar] [CrossRef]
Ch, M.M.I.; Riaz, M.M.; Iltaf, N.; Ghafoor, A.; Ali, S.S. A multifocus image fusion using highlevel DWT components and guided filter. Multimed. Tools Appl. 2020, 79, 12817–12828. [Google Scholar] [CrossRef]
Li, Z.G.; Zheng, J.H.; Zhu, Z.J.; Yao, W.; Wu, S.; Wu, S. Weighted guided image filtering. IEEE Trans. Image Process. 2015, 24, 120–129. [Google Scholar] [PubMed]
Guo, K.; Labate, D.; Lim, W.Q.; Labate, D.; Weiss, G.; Wilson, E. Wavelets with composite dilations and their MRA properties. Appl. Comput. Harmon. Anal. 2006, 20, 202–236. [Google Scholar] [CrossRef] [Green Version]
Kutyniok, G.; Labate, D. Shearlets: Multiscale Analysis for Multivariate Data; Sprinter Science + Business Media, LLC.: New York, NY, USA, 2012. [Google Scholar]
Vivone, G.; Restaino, R.; Dalla Mura, M.; Licciardi, G.; Chanussot, J. Contrast and error-based fusion schemes for multispectral image pansharpening. IEEE Geosci. Remote Sens. Lett. 2014, 11, 930–934. [Google Scholar] [CrossRef]
Guo, K.; Kutyniok, G.; Labate, D. Sparse multidimensional representations using anisotropic dilation and shear operators. In Proceedings of the International Conference on the Interaction between Wavelets & Splines; Nashboro Press: Brentwood, TN, USA, 2006; pp. 189–201. [Google Scholar]
Palubinskas, G. Fast, simple and good pan-sharpening method. J. Appl. Remote Sens. 2013, 7, 1–12. [Google Scholar] [CrossRef]
Alparone, L.; Wald, L.; Chanussot, J.; Thomas, C.; Gamba, P.; Bruce, L.M. Comparison of pansharpening algorithms: Outcome of the 2006 GRS-S data-fusion contest. IEEE Trans. Geosci. Remote Sens. 2007, 45, 3012–3021. [Google Scholar] [CrossRef]
Vivone, G.; Alparone, L.; Chanussot, J.; Mura, M.D.; Garzelli, A.; Licciardi, G.A.; Restaino, R.; Wald, L. A critical comparison among pansharpening algorithms. IEEE Trans. Geoence Remote Sens. 2015, 53, 2565–2586. [Google Scholar] [CrossRef]
Liu, Y.; Wang, Z. Simultaneous image fusion and denoising with adaptive sparse representation. Image Process. Iet. 2019, 9, 347–357. [Google Scholar] [CrossRef]
Li, H.; Wu, X.J.; Kittler, J. Infrared and visible image fusion using a deep learning framework. In Proceedings of the 2018 24rd International Conference on Pattern Recognition (ICPR), Beijing, China, 20–24 August 2018; pp. 2705–2710. [Google Scholar]
Wald, L. Quality of high resolution synthesised images: Is there a simple criterion? In Proceedings of the third conference Fusion of Earth data: Merging point measurements, raster maps and remotely sensed images, Nice, France, 26–28 January 2000; SEE/URISCA; pp. 99–103. [Google Scholar]
Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef]
Shao, Z.; Cai, J. Remote sensing image fusion with deep convolutional neural network. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 1656–1669. [Google Scholar] [CrossRef]
Luciano, A.; Bruno, A.; Stefano, B.; Andrea, G.; Filippo, N.; Massimo, S. Multispectral and panchromatic data fusion assessment without reference. ASPRS J. Photogramm. Eng. Remote Sens. 2008, 74, 193–200. [Google Scholar]

Figure 1. The sub-bands obtained by MS and PAN image after 2 layers decomposition by NSST. (a) Original image. (b) Low frequency sub-band. (c) High frequency sub-band first scale direction one. (d) High frequency sub-band first scale direction two. (e) High frequency sub-band second scale direction one. (f) High frequency sub-band second scale direction two. (g) High frequency sub-band second scale direction three. (h) High frequency sub-band second scale direction four.

Figure 2. MS and PAN data sets.

Figure 3. The direction of the template.

Figure 4.

3 \times 3

adjacent region elements vector diagram.

Figure 4.

3 \times 3

adjacent region elements vector diagram.

Figure 5. The gradient edge detail images of MS and PAN. (a) MS. (b) MS gradient edge detail. (c) PAN. (d) PAN gradient edge detail.

Figure 6. Statistical display of regional edge detection function distribution in RS image.

Figure 7. Comparison of GIF and DCGIF models extracting spatial detail information from PAN image.

Figure 8. The flow chart of the reconstruction of MS image component I.

Figure 9. Four directions of high frequency sub-band

3 \times 3

neighborhood.

Figure 9. Four directions of high frequency sub-band

3 \times 3

neighborhood.

Figure 10. The flow chart of final texture details information.

Figure 11. The influence of different values of window radius on DCGIF model.

Figure 12. Flow chart of MS-Pansharpening algorithm in this paper.

Figure 13. Experimental data set.

Figure 14. MS and Degraded data comparison experiment results.

Figure 15. Real Data comparison experiment results.

Table 1. Statistical results of directional neighborhood correlation and eight neighborhoods correlation of the high frequency sub-bands.

			San Clement Area Image		Sydney Area Image		San Francisco Area Image
			MS	PAN	MS	PAN	MS	PAN
First-scale	Sub-band 1	$a$	32,196	32,217	32,286	32,721	32,938	32,570
		$b$	7804	7783	7714	7279	7062	7430
		$a / (a + b) (%)$	80.49	80.54	80.72	81.80	82.35	81.43
	Sub-band 2	$a$	32,596	32,986	33,394	33,643	33,987	33,846
		$b$	7404	7014	6606	6357	6013	6154
		$a / (a + b) (%)$	81.49	82.47	83.49	84.11	84.97	84.62
Second-scale	Sub-band 1	$a$	34,338	34,551	34,730	34,790	35,177	34,627
		$b$	5662	5449	5270	5210	4823	5373
		$a / (a + b) (%)$	85.85	86.38	86.83	86.98	87.94	86.57
	Sub-band 2	$a$	32,509	32,630	31,260	30,672	33,262	31,877
		$b$	7491	7370	8740	9328	6738	8123
		$a / (a + b) (%)$	81.27	81.58	78.15	76.68	83.15	79.69
	Sub-band 3	$a$	34,263	34,232	35,190	35,462	35,988	35,091
		$b$	5737	5768	4810	4538	4012	4909
		$a / (a + b) (%)$	85.66	85.58	87.98	88.66	89.97	87.73
	Sub-band 4	$a$	31,486	30,936	30,715	30,660	31,708	30,171
		$b$	8514	9064	9285	9340	8292	9829
		$a / (a + b) (%)$	78.72	77.34	76.79	76.65	79.27	75.43

Note:

a

is the number of coefficients in which the directional neighborhood correlation is stronger than the eight neighborhood correlations in the high frequency subband;

b

is the number of coefficients whose eight neighborhood correlations is stronger than the directional neighborhood correlation;

a / (a + b)

is the proportion of directional neighborhood correlation stronger than the eight neighborhood correlations.

Table 2. Objective evaluation of DCGIF filtering results with different window radius.

Evaluation Index	$r = 1$	$r = 2$	$r = 4$	$r = 8$
RMSE	23.8240	30.5323	34.8185	37.8988
SAM	0.3917	0.5453	0.6952	0.8574

Table 3. Correlation statistics between neighboring bands of MS.

Correlation Coefficient	MS-Test-1	MS-Test-2	MS-Test-3	MS-Test-4	MS-Test-5	MS-Test-6	MS-Test-7	MS-Test-8
$r_{12}$	0.9686	0.9891	0.9738	0.9853	0.9235	0.9611	0.9712	0.9792
$r_{23}$	0.9681	0.9930	0.9850	0.9894	0.9626	0.9868	0.9673	0.9880

Table 4. Objective evaluation of degraded data.

Data Set	Evaluation Index	IHS	GIF	ATWT	AWLP	ASR	Deep Learning	NSST + SR + PCNN	NSST + DCGIF
Test-RS-1	RMSE	46.1822	36.9231	21.8892	22.0637	31.0385	24.1516	19.3740	16.6651
	ERGAS	20.8551	16.6699	9.8857	9.9650	14.0130	10.9012	8.7466	7.5239
	CC	0.8357	0.9408	0.9617	0.9612	0.9408	0.9617	0.9694	0.9776
	SSIM	0.5867	0.5276	0.8008	0.8018	0.6706	0.7981	0.7950	0.8751
	SAM	4.3361	1.0508	2.7560	4.3361	1.0067	1.4456	1.3041	0.3463
	RASE	41.7080	33.3459	19.7685	19.9261	28.0314	21.8117	17.4970	15.0506
Test-RS-2	RMSE	42.7084	44.1567	15.7518	15.7411	28.7240	24.4196	19.5112	15.6839
	ERGAS	32.2701	33.1070	11.9120	11.8041	21.4947	18.3585	14.7141	11.8500
	CC	0.8919	0.7885	0.9712	0.9725	0.9272	0.9522	0.9504	0.9714
	SSIM	0.7414	0.7176	0.9358	0.9381	0.8171	0.8834	0.8905	0.9394
	SAM	5.0471	3.5388	5.9898	3.5308	3.3412	3.6489	3.2974	0.3748
	RASE	64.1366	66.3117	23.6551	23.6389	43.1359	36.6171	29.3007	23.5530
Test-RS-3	RMSE	41.2275	44.7460	18.4704	18.6100	29.2235	21.4150	20.3299	17.5247
	ERGAS	18.5383	20.1201	8.2854	8.3358	13.1382	9.6341	9.1409	7.8784
	CC	0.6902	0.8138	0.9431	0.9423	0.8389	0.9121	0.9267	0.9467
	SSIM	0.7186	0.5209	0.8623	0.8620	0.7659	0.8807	0.8285	0.8792
	SAM	3.2074	1.4370	2.5378	2.2618	2.0981	2.4695	2.0469	0.3035
	RASE	37.0416	40.2028	16.5950	16.7205	26.2563	19.2407	18.2657	15.7453
Test-RS-4	RMSE	37.2295	38.0868	18.9726	18.9917	27.3476	21.6457	19.0765	16.8401
	ERGAS	17.6893	18.7514	9.8712	9.8679	12.7405	10.2940	9.7367	8.7608
	CC	0.9296	0.8798	0.9695	0.9695	0.9499	0.9682	0.9657	0.9745
	SSIM	0.8131	0.7847	0.9329	0.9332	0.8654	0.9277	0.9202	0.9515
	SAM	4.6999	3.5359	5.9898	4.9606	3.3908	5.3534	5.0577	0.3096
	RASE	35.2712	37.5283	19.7080	19.7073	25.5687	20.4844	19.4514	17.5329

Table 5. Objective evaluation of real data.

Evaluation Index	IHS	GIF	ATWT	AWLP	ASR	Deep Learning	NSST + SR + PCNN	NSST + DCGIF
$Q N R$	0.7561	0.2979	0.8765	0.8778	0.7765	0.8384	0.8541	0.8780
$D_{λ}$	0.0119	0.0073	0.0105	0.0092	0.0043	0.0037	0.0104	0.0109
$D_{S}$	0.2349	0.6999	0.1142	0.1140	0.2202	0.1584	0.1370	0.1123

Table 6. Time complexity statistics of different fusion algorithms.

Data Set	Time(s)
Data Set	IHS	GIF	ATWT	AWLP	ASR	Deep Learning	NSST + SR + PCNN	NSST + DCGIF
Test-RS-1	1.50	2.45	0.61	1.00	202.38	4.06	110.07	44.74
Test-RS-2	1.42	3.04	0.95	0.96	207.59	3.69	122.39	46.13
Test-RS-3	1.55	2.34	0.59	0.60	218.66	4.13	109.54	45.85
Test-RS-4	1.60	2.27	0.58	0.64	222.32	3.94	124.96	51.44
Test-RS-5	1.55	2.29	0.60	0.62	225.18	3.92	137.39	43.79

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, X.; Mu, Z.; Bai, S.; Feng, Y.; Song, R. MS-Pansharpening Algorithm Based on Dual Constraint Guided Filtering. Remote Sens. 2022, 14, 4867. https://doi.org/10.3390/rs14194867

AMA Style

Wang X, Mu Z, Bai S, Feng Y, Song R. MS-Pansharpening Algorithm Based on Dual Constraint Guided Filtering. Remote Sensing. 2022; 14(19):4867. https://doi.org/10.3390/rs14194867

Chicago/Turabian Style

Wang, Xianghai, Zhenhua Mu, Shifu Bai, Yining Feng, and Ruoxi Song. 2022. "MS-Pansharpening Algorithm Based on Dual Constraint Guided Filtering" Remote Sensing 14, no. 19: 4867. https://doi.org/10.3390/rs14194867

APA Style

Wang, X., Mu, Z., Bai, S., Feng, Y., & Song, R. (2022). MS-Pansharpening Algorithm Based on Dual Constraint Guided Filtering. Remote Sensing, 14(19), 4867. https://doi.org/10.3390/rs14194867

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

MS-Pansharpening Algorithm Based on Dual Constraint Guided Filtering

Abstract

1. Introduction

2. Related Works

2.1. Guided Image Filter

2.2. NSST and Its Sub-Band Properties

3. Methodology

3.1. Construction of the Dual Constraint Guided Filtering Model

3.1.1. Neighborhood Elements Vector and Average Gradient Correlation Constraint

3.1.2. Region Edge Recognition Function Based on the Neighborhood Elements Vector Angle and the Average Gradient

3.1.3. Model Construction

3.2. MS-Pansharpening Algorithm Based on DCGIF in NSST Domain

3.2.1. Reconstruction of MS Intensity Component Guided by High Frequency Information of NSST from PAN

3.2.2. Extraction of Decision Texture Details for PAN Based on DCGIF Model

3.2.3. Detail Information Injection Model for MS Based on Spectral Correlation

3.2.4. The Process of Algorithm Implementation

4. Experiment and Analysis

4.1. Data Set

4.2. Degraded Data Experiment

4.3. Real Data Experiment

4.4. Computation Complexity Analysis

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI