A Segmentation-Cooperated Pansharpening Method Using Local Adaptive Spectral Modulation

Abstract: In order to improve the spatial resolution of multispectral (MS) images and reduce spectral distortion, a segmentation-cooperated pansharpening method using local adaptive spectral modulation (LASM) is proposed in this paper. By using the k-means algorithm for the segmentation of MS images, different connected component groups can be obtained according to their spectral characteristics. For spectral information modulation of fusion images, the LASM coefficients are constructed based on details extracted from images and local spectral relationships among MS bands. Moreover, we introduce a cooperative theory for the pansharpening process. The local injection coefficient matrix and LASM coefficient matrix are estimated based on the connected component groups to optimize the fusion result, and the parameters of the segmentation algorithm are adjusted according to the feedback from the pansharpening result. In the experimental part, degraded and real data sets from the GeoEye-1 and QuickBird satellites are used to assess the performance of our proposed method. Experimental results demonstrate the validity and effectiveness of our method. Generally, the method is superior to several classic and state-of-the-art pansharpening methods in both subjective visual effect and evaluation indices, achieving a balance between the injection of spatial details and maintenance of spectral information, while effectively reducing the spectral distortion of the fusion image.


Introduction
With the continuous progress of satellite and sensor technology, remote sensing image data with high spectral and spatial resolution can be acquired simultaneously. The spectral resolution can reach the nanometer level, while the spatial resolution can reach the submeter level. However, due to transmission bottlenecks and signal-to-noise ratio (SNR) limitations [1], the acquired remote sensing data have complementary characteristics, such as low-resolution multispectral (LRMS) images obtained at the expense of spatial resolution and panchromatic (PAN) images with lower spectral resolution and higher spatial resolution. To achieve high-resolution multispectral (HRMS) images, image fusion technology is required. By merging PAN images with LRMS images, the complementary information of the two can be integrated and the redundant information can be removed. This process is usually called pansharpening [2], which, as one branch of image fusion, aims to obtain HRMS images by injecting spatial details from PAN into LRMS. Furthermore, more accurate scene descriptions and more reliable interpretations can be provided.
Many studies have put forward theories on and methods for pansharpening. Component substitution (CS) methods and multiresolution analysis (MRA) methods are the most widely used traditional approaches. However, because the spatial structure differences between the PAN and MS images and the relationships among MS bands are ignored in the process of coefficient injection, problems of injection bias and spectral distortion remain, or manual intervention is needed during pansharpening.
In this paper, in order to improve the fusion quality of MS and PAN images while reducing the spectral distortion, a pansharpening method based on local adaptive spectral modulation (LASM) and cooperation with segmentation is proposed. This method has an adaptive spectral modulation system and can adjust the segments according to the fusion feedback. In this method, the k-means algorithm is used to segment MS images to obtain connected component groups with similar spectral characteristics. The local injection coefficient matrix is estimated based on each group. MTF-GLP technology is used to extract the spatial details of MS and PAN images. To modulate the spectral information in the fusion result, LASM coefficients are constructed based on extracted image details and the spectral relationship between MS bands. By measuring the distance between fused HRMS images and upsampled LRMS images, the optimal number of connected component groups is adaptively selected to make the spectral features of the fused image as close as possible to the original LRMS image. Through the cooperation between fusion and segmentation, the local injection coefficient matrix and LASM coefficient matrix are estimated based on the connected component groups to optimize the pansharpening result, and the parameters of the segmentation algorithm are adjusted according to feedback from the fusion image. Finally, experimental results on GeoEye-1 and QuickBird satellite data sets show that the proposed pansharpening method can effectively enhance the spatial detail information and reduce the spectral distortion of fusion images.
This paper is organized as follows: The second part introduces the pansharpening problem and the key technology of pansharpening. The proposed LASM and the cooperative approach between pansharpening and segmentation are presented in the third part. A performance comparison and analysis are provided in the fourth part by experimental results on degraded and real image data from different satellites. The final part presents the study's conclusions.

Model for the Pansharpening Problem
The pansharpening problem of MS and PAN images needs a fusion model to achieve a balance between injecting spatial details and preserving spectral information. This model can be either a global model based on the whole image or a local model based on the image context, such as spectral [30] or spatial [31] information.
Fusion of MS and PAN images yields a high-spatial-resolution MS image MS^ = {MS^_k}, k = 1, ..., N. While maintaining the spectral content of the MS image, the spatial details of the PAN image are injected into the MS bands so that the fusion image preserves spectral diversity and its spatial resolution matches that of the original PAN image. The definition of MS^_k is

MS^_k = MS_k + g_k ∘ D_k, (1)

where N is the number of MS bands, MS_k is the k-th band of the MS image upsampled to the PAN image size, g_k represents the k-th band of the injection gain matrix, D_k denotes the detail image of band k extracted from the PAN image, and ∘ represents the pixel-by-pixel multiplication between the injection coefficient matrix and the detail image. D_k is obtained by subtracting the corresponding low-resolution version of the PAN image from the histogram-matched PAN image. Most existing fusion models only consider modulation coefficient estimation for the spatial detail part, but some fusion models add coefficient modulation for the spectral part. In literature [32,33], spectral modulation coefficients are introduced into the pansharpening methods so that the spectral information of the MS image can be better preserved. Based on this, the fusion model can be expressed as

MS^_k = α_k ∘ MS_k + g_k ∘ D_k, (2)

where α_k denotes the spectral modulation coefficient matrix for the k-th band. In this paper, the MRA-based scheme is adopted. The primary steps of the MRA-based pansharpening approach are as follows: first, the original MS image is interpolated to obtain the upscaled MS image with the same size as the PAN image; then the low-resolution version of the PAN image is calculated by multiscale decomposition, and the corresponding injection gain matrix and spectral modulation coefficient matrix are calculated; finally, spectral modulation and detail injection are completed according to Equation (2) to obtain the fusion image.
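The two fusion models reduce to a per-pixel multiply-and-add. A minimal numpy sketch (the function name is ours, not from the paper):

```python
import numpy as np

def fuse_band(ms_up, detail, g, alpha=None):
    """Fusion model of Eq. (2): MS^_k = alpha_k ∘ MS_k + g_k ∘ D_k.
    With alpha=None this reduces to the plain injection model of Eq. (1)."""
    if alpha is None:
        alpha = np.ones_like(ms_up)          # no spectral modulation
    return alpha * ms_up + g * detail        # ∘ is element-wise multiplication

# toy usage: constant band, unit detail, scalar gain 0.5
fused = fuse_band(np.full((4, 4), 10.0), np.ones((4, 4)), g=0.5)
# every fused pixel equals 10.0 + 0.5 * 1.0 = 10.5
```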
The key technologies during the pansharpening process-detail image estimation, injection coefficient construction, and spectral modulation coefficient construction-are detailed below.

Detail Image Estimation
Since the relationship between the low-resolution version of the PAN image and the MS bands is non-linear, the weighted sum of MS bands cannot properly describe low-resolution PAN images with different land covers. However, the estimation of low-resolution PAN images directly affects the extraction of image details, so the MRA-based multiscale decomposition method is used in this paper to calculate the low-resolution version of PAN images.
The performance of the MRA-based method can be improved by frequency analysis of images according to the filter whose frequency response amplitude matches the MTF of the imaging system. A clearer geometric structure can be produced through this method than with an ideal MRA filter. In fact, the spatial frequency response of the Gaussian filter can be adjusted to match the MTF of the sensor. In this way, we can extract the detail information from the PAN image that cannot be obtained by the MS sensors because of the coarse spatial resolution.
MTF-GLP technology is used to calculate the detail images [28] in this paper. Before the introduction of multiresolution wavelet analysis, Burt and Adelson [10] proposed the Laplacian pyramid (LP), a band-pass image decomposition method based on the Gaussian pyramid (GP). The construction process of a Laplacian pyramid is as follows: perform low-pass filtering and downsampling of the original image to obtain an approximate image at a coarse scale; that is, the low-pass approximate image is obtained by decomposition. Then the approximate image, after interpolation and filtering, is subtracted from the original image, which is equivalent to band-pass filtering. The next level of decomposition is carried out on the obtained low-pass approximate image, and the multiscale decomposition is completed iteratively. This method has been proven suitable for the fusion of remote sensing images [14].
First, we calculate the low-resolution version P^LP_k of the PAN image P_k:

P^LP_k = P_k * h_k, (3)

where h_k denotes the linear time-invariant filter and * represents the convolution operation. The frequency response of h_k approximates a Gaussian shape and matches the gain at the Nyquist cutoff frequency of the exact MTF of the sensor that acquires the k-th MS band [15]. We subtract the low-resolution image P^LP_k from the PAN image P_k to yield the spatial detail image

D_k = P_k − P^LP_k, (4)

where P_k is the PAN image after histogram equalization according to MS_k and P^LP_k denotes the low-resolution version of P_k.
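A sketch of the MTF-matched detail extraction of Equations (3) and (4), assuming a Gaussian low-pass whose response at the Nyquist cutoff 1/(2·ratio) equals an assumed sensor MTF gain (0.3 is a placeholder here, not a published sensor value):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mtf_detail(pan, gain_nyquist=0.3, ratio=4):
    """Eqs. (3)-(4): D_k = P_k - P_k^LP with an MTF-matched Gaussian h_k.
    Solving exp(-2*pi^2*sigma^2*f^2) = gain_nyquist at f = 1/(2*ratio)
    gives the spatial standard deviation of the filter."""
    sigma = (ratio / np.pi) * np.sqrt(-2.0 * np.log(gain_nyquist))
    pan_lp = gaussian_filter(pan, sigma)     # low-resolution version P^LP_k
    return pan - pan_lp                      # spatial detail image D_k

pan = np.random.default_rng(0).normal(size=(64, 64))
detail = mtf_detail(pan)                     # high-frequency residual
```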

Injection Coefficient Construction
Spectral characteristics change with different objects, regions, and environments [34], and therefore the spectral relationship between MS and PAN images is not fixed. If we inject spatial details obtained from the PAN image into each MS band without considering the differences between local areas, the fusion quality in both spectral and spatial aspects will suffer. Estimating an appropriate injection coefficient matrix is therefore an important step; the spatial details obtained from the PAN image can then be weighted by their respective coefficients and injected into each MS band. A regression-based model from literature [28] is employed to estimate the injection coefficients in this paper. In literature [28], regression analysis between low-resolution PAN images and MS bands was employed and expanded, and injection coefficient estimation was performed on image regions composed of pixels with similar spectral characteristics. In this case, the injection coefficient matrix is calculated by

g_k = Cov(MS_k, P^LP_k) / Var(P^LP_k), (5)

in which Cov(·,·) denotes the covariance and Var(·) is the variance. According to Equation (5), the locally implemented expression on a local region is

g_k(p) = Cov(R^MS_p, R^P_p) / Var(R^P_p), (6)

where R^MS_p and R^P_p represent the connected component groups containing pixel p in images MS_k and P^LP_k, respectively. Since all pixels in the same connected component group adopt the same gain coefficient after localization, the injection weights of the detail information can be adaptively adjusted according to the spectral characteristics of the local region.
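The local gain of Equation (6) amounts to a covariance/variance ratio computed per segment. A sketch (the segmentation labels are assumed given; the helper name is ours):

```python
import numpy as np

def local_injection_gains(ms_band, pan_lp, labels):
    """Eq. (6): g_k(p) = Cov(R_p^MS, R_p^P) / Var(R_p^P), one gain per
    connected component group, broadcast back to the pixels of the group."""
    g = np.zeros_like(ms_band, dtype=float)
    for lab in np.unique(labels):
        m = labels == lab
        ms_r, p_r = ms_band[m], pan_lp[m]
        cov = ((ms_r - ms_r.mean()) * (p_r - p_r.mean())).mean()
        var = p_r.var()
        g[m] = cov / var if var > 1e-12 else 1.0   # guard flat regions
    return g

# usage: one segment where MS = 2*PAN_LP + 3, so the estimated gain must be 2
rng = np.random.default_rng(1)
pan_lp = rng.normal(size=(8, 8))
gains = local_injection_gains(2.0 * pan_lp + 3.0, pan_lp, np.zeros((8, 8), int))
# gains ≈ 2.0 everywhere
```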
In this paper, the k-means algorithm is used to segment MS images into connected component groups according to spectral characteristics, and local injection coefficients are estimated based on each connected component group.

Construction of Spectral Modulation Coefficient
According to literature [32], when extracting spatial information from PAN images, certain spatial details already contained in the MS images should be removed. The authors therefore introduced a spectral modulation (SM) scheme into the fusion model that utilizes a Gaussian function to obtain the specific spatial details from the PAN and MS images; SM coefficients are then constructed by removing the details present in MS from the details in PAN. Desirable results for the preservation of spectral information have been obtained by this method. The SM coefficient (Equation (7)) is built from the spatial details of P and I filtered with the Gaussian kernel

G(x, y; σ) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²)),

in which P represents the PAN image, I = (1/N) Σ_{k=1}^{N} MS_k denotes the intensity component of the MS image, N is the number of MS channels, and Max(x, y) = max_k{MS_k(x, y)} is the maximum value of the corresponding MS bands at the pixel point (x, y). The Gaussian convolution with G(x, y; σ) acts as a low-pass filter, * represents the convolution operation, and σ represents the scale factor in the Gaussian function.
In this paper, an improved spectral modulation approach based on SM is proposed to preserve spectral information.

Proposed Method
In this paper, the proposed pansharpening method mainly focuses on two parts: constructing the LASM coefficient and introducing cooperation with segmentation into the pansharpening process.

LASM Coefficient Construction
Using the SM scheme proposed in literature [32] as a guide, in this paper, the MTF-GLP filter replaces the Gaussian filter used in Equation (7) to construct the spatial details of the PAN and MS images. The calculation of the PAN details is shown in Equations (3) and (4). The MS details can be obtained by

D^MS_k = MS_k − MS^LM_k, (8)

where MS^LM_k denotes the low-resolution version of the MS image; that is, the detail image D^MS_k is achieved by subtracting its low-resolution version from the MS band MS_k.

The fusion image needs to reproduce the spectral features of the original MS image, including the spectral features within a single band and the relationships among MS bands. In the fusion process, spatial enhancement is correlated with the extraction and injection of the spatial structure of the PAN image, while spectral preservation is correlated with the bands and interband relationships of the MS image. As shown in Equation (7), the SM coefficient α only considers the relationship between PAN spatial details and MS spatial details. In fact, optical remote sensing uses optical systems to collect and record the radiation reflected and emitted by ground objects, MS images have multiple bands covering a wide range of reflectivity, and the same ground object may have significantly different reflection characteristics in different bands. Therefore, spectral preservation needs to consider the interband relationships of MS images [33]; that is, the relationship between the MS channels is important to improve the spectral fidelity of the fusion results. Thus, we constructed the LASM coefficient based on the interband relationships of MS images and the spatial structure relationship to modulate the preservation of spectral information. First, the adaptive spectral modulation coefficient is defined in terms of β_k, where β_k represents the spectral contribution ratio of the k-th band to the MS image, reflecting the spectral differences of the pixels among different bands of the image.
If the spectral contribution ratios of the bands differ only slightly, the spectral information among the MS bands is relatively similar, and therefore the corresponding modulation coefficients for the MS bands have similar amplitudes. The LASM coefficients can easily be implemented locally by calculating α_k on each connected component group obtained by image segmentation. The LASM coefficient α_k of the image region containing pixel p is calculated per group, where R^{D_P}_p and R^{D_MS}_p represent the connected component groups containing pixel p in the detail images D^P_k and D^MS_k, respectively; β_k is the spectral contribution ratio of the local connected component group; and max{R^MS_p(i, j)} is the maximum value of the corresponding group covering the pixel point (i, j) in the MS image. The pseudo-code for the LASM coefficient construction is summarized in Algorithm 1.
Algorithm 1 LASM coefficient construction: for each connected component group, calculate the LASM coefficient from the local spectral contribution ratio and the PAN and MS detail images.

Performance Test of LASM
The method based on our proposed LASM was compared with the method without LASM and the method using SM [32] instead of LASM. A set of MS and PAN images from the GeoEye-1 satellite was selected as an example for comparative analysis. The reference image and the fusion results of these three methods are shown in Figure 1. Enlarged local details are shown in the lower left corner of the image. Six commonly used objective evaluation indices were adopted to evaluate the performance of the three methods: correlation coefficient (CC), structural similarity (SSIM), spectral angle mapper (SAM), root mean square error (RMSE), erreur relative globale adimensionnelle de synthèse (ERGAS), and universal image quality index (UIQI). The corresponding results of the evaluation indices are described in Table 1.
From Figure 1 we can see that the method that uses SM in place of the LASM coefficient [32] shows certain spectral distortion compared with the reference image, and excessive detail injection occurs in some local areas. There is no obvious difference in visual effect between the methods with and without LASM. However, as illustrated in Table 1, the fusion result obtained by the method with LASM achieves the best values of the indices except for RMSE. The optimal RMSE is obtained by the fusion method without LASM, and the LASM-based method is slightly below the best value. Generally speaking, compared with the method without LASM, the LASM-based method can achieve a better fusion result with improved spectral fidelity and enhanced spatial quality. Figure 2 shows the spectral horizontal profile curves of the fusion images achieved by the above three methods. From this figure, we can see that there is a large deviation between the curve obtained by the SM-based method and that of the reference image. The curves acquired by the methods with and without LASM are relatively closer to the reference curve. According to the enlarged local details in the rectangular boxes, it is concluded that the highest spectral fidelity is achieved by the LASM-based fusion method.

Cooperation with Segmentation Using K-Means
In remote sensing image processing, segmentation and pansharpening are usually regarded as two interrelated steps, but they seldom cooperate with each other. Fusion results cannot be optimized according to the characteristics of segmentation methods, and segmentation results are seldom used to guide pansharpening [35]. Using the idea of cooperation between pansharpening and segmentation, image segmentation is applied to optimize the fusion results, and the parameters of the segmentation algorithm are adjusted according to the feedback from the fusion results. The locality of the pansharpening method based on image segmentation depends on the partitioning of connected component groups.
Many clustering algorithms could be selected for this step. The k-means algorithm, a classic partition-based clustering method and one of the top 10 data mining algorithms, is used for image segmentation in this paper. K-means partitions data according to distance-based similarity between pixels. In order to improve spectral fidelity and reduce spectral distortion, the MS image is segmented by the spectral similarity of pixels: pixels with the same spectral characteristics are clustered into the same connected component group, and the same calculation is applied to all pixels in a group, so that spatial details can be injected uniformly.
After the image is segmented into connected component groups based on the k-means algorithm, the local injection coefficients and LASM coefficients are calculated for each group, then they are applied to the local connected component groups, weighting the spectral information to be maintained and the spatial details to be injected. When the number of groups falls to one, we regard it as the global-based method.
The MS image is clustered into different pixel subsets S = (S_1, ..., S_K) by the k-means algorithm. The cosine of the angle between two vectors is used to measure the difference between individuals. Therefore, the distance function J is defined as

J(M_j, µ_t) = (M_j · µ_t) / (||M_j|| ||µ_t||),

in which J is the cosine similarity between M_j and µ_t, which serves as the correlation measure; µ_t is the pixel mean value of the connected component group S_t; M_j = (m_1j, ..., m_Nj) is the spectral channel vector of pixel j; N is the number of MS channels; K denotes the clustering number; and ||·|| represents the modulus of a vector.
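The cosine-similarity clustering above can be sketched as a small k-means variant (a simplified illustration, not the paper's exact implementation):

```python
import numpy as np

def cosine_kmeans(pixels, K, iters=50, seed=0):
    """Cluster N-band spectral vectors with the cosine similarity
    J = <M_j, mu_t> / (||M_j|| * ||mu_t||).

    pixels : (num_pixels, N) array of spectral vectors.
    Returns integer labels assigning each pixel to a group.
    """
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), K, replace=False)]
    for _ in range(iters):
        # normalized dot products -> cosine similarity to each center
        pn = pixels / (np.linalg.norm(pixels, axis=1, keepdims=True) + 1e-12)
        cn = centers / (np.linalg.norm(centers, axis=1, keepdims=True) + 1e-12)
        labels = (pn @ cn.T).argmax(axis=1)      # most similar center wins
        new = np.array([pixels[labels == t].mean(axis=0) if np.any(labels == t)
                        else centers[t] for t in range(K)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

# usage: two clearly separated spectral directions land in different groups
a = np.tile([1.0, 0.0, 0.0, 0.0], (20, 1))
b = np.tile([0.0, 1.0, 0.0, 0.0], (20, 1))
labels = cosine_kmeans(np.vstack([a, b]), K=2)
```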
The value of K in the k-means algorithm is the key parameter to obtain satisfactory segmentation results, but it depends on the different earth coverings of the MS image. In this paper, a method of cooperation between pansharpening and segmentation is proposed, which uses the fusion result to guide the adaptive selection of segmentation parameter and optimizes the fusion result based on image segmentation. Figure 3 shows the specific pansharpening process.
Our proposed method contains four major steps:
1. Set the initial value K = 3, then use the random selection algorithm to select the initial cluster centers among all the pixels. Use the k-means algorithm to cluster the MS image into K groups according to the spectral similarity measure, and segment the PAN image according to the MS segments.
2. Calculate the LASM coefficient matrix for each connected component group by Equations (13) and (14), and obtain the local injection coefficient matrix by Equation (6).
3. Calculate the fusion image MS^ by Equation (2). The difference d_K between the upscaled MS image and the smoothed fusion image [29] is calculated over the connected component groups, where j is the index of the pixel vector in S_t, which represents the t-th group of pixels; MS_fus denotes the HRMS image after average-value filtering; and |·| represents the number of elements.
4. Set K = K + 1, K ∈ [3, 9]. Repeat steps 1-4, select the optimal value of K with the minimum difference d_K, and output the fusion image.
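The four steps form a simple feedback loop over K. A skeleton of that loop (the three callables are stand-ins we supply, not the paper's API):

```python
def cooperative_pansharpen(ms_up, pan, segment, fuse, distance, k_range=range(3, 10)):
    """Sketch of the cooperation loop: the segmentation parameter K is
    chosen by feedback from the fusion result (steps 1-4 above).

    segment(ms_up, K) -> label map; fuse(ms_up, pan, labels) -> HRMS image;
    distance(ms_up, fused) -> scalar d_K.  All three are supplied by the
    caller; the names are ours, not the paper's.
    """
    best = None
    for K in k_range:
        labels = segment(ms_up, K)            # step 1: k-means into K groups
        fused = fuse(ms_up, pan, labels)      # steps 2-3: coefficients + Eq. (2)
        d_K = distance(ms_up, fused)          # step 3: spectral difference d_K
        if best is None or d_K < best[0]:
            best = (d_K, fused)               # step 4: keep the minimum d_K
    return best[1]

# toy usage with stub callables: the K minimizing the feedback distance wins
out = cooperative_pansharpen(
    ms_up=None, pan=None,
    segment=lambda ms, K: K,
    fuse=lambda ms, pan, labels: labels,
    distance=lambda ms, f: abs(f - 5))
# out == 5
```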
Here is the pseudo-code for the proposed pansharpening algorithm (Algorithm 2) based on cooperation with segmentation:

Algorithm 2 Pansharpening algorithm based on cooperation with segmentation
Input: Original MS and PAN images, range of segment numbers [3, 9]
Output: Fused image MS_fus
Begin
Interpolate MS to the size of P, yielding the upscaled MS image
For K = 3 to 9: segment the MS image into K groups, estimate the local coefficients, fuse by Equation (2), and compute d_K
Select the optimal number of segments with the minimum difference min{d_3, ..., d_9}

Performance Test of Cooperation with Segmentation
The proposed method based on cooperation with segmentation is compared with the method without segmentation. The example images from the GeoEye-1 satellite selected in Section 3.1.2 are used for the performance analysis in this part. The segmentation result is shown in Figure 4a. Figure 4b,c show the fusion results obtained by the method without segmentation and the method based on cooperation with segmentation, respectively.

As can be seen from Figure 4, the MS image is segmented into five connected component groups according to the spectral characteristics. The fusion results show that the method based on cooperation with segmentation achieves better spectral quality than the method without segmentation; for example, the red roof in the enlarged rectangular box suffers some color distortion in the method without segmentation. The performance of the fusion results of Figure 4b,c is shown in Figure 5; the indices cannot be uniformly normalized, so the broken lines of the first three indices are enlarged for convenience of comparison. It can be seen from this chart that the fusion result obtained by the method with segmentation achieves the optimal values of the indices. This shows that the spectral fidelity and spatial quality are improved effectively by introducing the scheme of cooperation with segmentation.

K-means clustering is an unsupervised learning approach that tries to find natural categories of sample data. Without any prior knowledge, k-means clusters data points with similar characteristics into different regions according to iteration rules. It is a real-time method with rapid convergence, easy implementation, and a simple concept. In most cases, the segmentation results obtained by k-means are satisfactory. The segmentation algorithms mentioned in literature [23] and literature [36] can be successfully applied to image segmentation, but they are relatively complex and require a large amount of computation. K-means may not always reach the global optimal solution, and an unsatisfactory initial value may lead to a poor segmentation result, which will affect the pansharpening performance. For k-means clustering, we usually repeat the process a certain number of times and keep the best result. Through the cooperation between segmentation and pansharpening, we use the feedback from the fusion results to guide the segmentation process and obtain the optimal clustering.

Experimental Results and Comparisons
Degraded and real data sets from different satellites are used for the experiments and analysis in this part. First, an experiment on the degraded data sets is performed to assess the performance of the proposed method; the fusion results can be compared with the reference images by visual inspection and objective evaluation indices. Second, our proposed method is applied to the real data sets for performance evaluation. Seven classic and popular pansharpening methods are used for comparison, including à trous wavelet transform (ATWT) [4], GS [7], MTF-matched GLP with context-based decision (MTF-GLP-CBD) [25], BDSD [8], morphological filter-half gradients (MF-HG) [24], GSA-BPT [28], and GSA-HA [29].

Data Sets
(1) GeoEye-1 data set: The experimental data set from the GeoEye-1 satellite is shown in Figure 6a,e and Figure 6c,g, which are the degraded and real image pairs, respectively. This data set was acquired over Hobart, Australia, on 24 February 2009 and provides 0.5 m PAN and 2 m LRMS images.
(2) QuickBird data set: As shown in Figure 6b,f and Figure 6d,h, both the degraded and real data sets from the QuickBird satellite are used for reduced- and full-resolution assessment. The QuickBird data set was captured over Sundarbans, India, on 21 November 2002. The spatial resolutions of the PAN and LRMS images are 0.7 m and 2.8 m, respectively.

The data sets collected from the GeoEye-1 and QuickBird satellites consist of one PAN band and four MS bands (R, G, B, and NIR), but only the R, G, and B bands are used for visual display. Because the data sets contain no corresponding reference images with which to evaluate pansharpening performance, the original PAN and LRMS images were processed with MTF filtering and decimation to obtain the degraded data for fusion; the original LRMS image can then be adopted as the reference. Both the degraded and real data from the GeoEye-1 and QuickBird satellites were used to measure the performance of the pansharpening methods. The sizes of the PAN and LRMS images used in this paper are 256 × 256 and 64 × 64, respectively, and the size of the reference image is 256 × 256.
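The degradation step (MTF filtering followed by decimation) can be sketched as follows. This is a generic Gaussian approximation of the sensor MTF under Wald's protocol, with an assumed gain at Nyquist, not the exact filters used in the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mtf_degrade(band, ratio=4, mtf_gain=0.3):
    """Degrade one image band following Wald's protocol: low-pass filter
    with a Gaussian matched to the sensor MTF, then decimate by `ratio`.
    `mtf_gain` is the assumed MTF value at the Nyquist frequency; 0.3 is
    a typical multispectral figure, not the exact GeoEye-1/QuickBird one."""
    # Std of the Gaussian whose frequency response equals mtf_gain at the
    # Nyquist frequency of the decimated image:
    # exp(-2 * pi^2 * sigma^2 * f^2) = mtf_gain with f = 1 / (2 * ratio).
    sigma = ratio * np.sqrt(-2.0 * np.log(mtf_gain)) / np.pi
    filtered = gaussian_filter(band.astype(float), sigma)
    return filtered[::ratio, ::ratio]
```

Applied band by band to a 256 × 256 image with `ratio=4`, this yields the 64 × 64 degraded inputs described above.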

Quality Indices
Evaluating the performance of the pansharpening methods mainly entails subjective visual analysis and objective evaluation. In this paper, six evaluation indices are selected for the reduced-resolution evaluation of the degraded data sets, and three indices are used for the full-resolution assessment of the real data sets.
(1) Reduced-resolution assessment: The six indices are: CC [37], SSIM [38], SAM [39], RMSE [40], ERGAS [3], and UIQI [41]. CC and SAM are used to measure spectral quality. Three indices account for spatial quality: SSIM, RMSE, and ERGAS. SSIM reflects the structural similarity between the reference image and the fusion image, while RMSE and ERGAS represent the difference between the two. UIQI is a global index used to measure spatial and spectral qualities.
(2) Full-resolution assessment: For the real data experiment, due to the lack of corresponding reference images, the quality with no reference (QNR) index [42] and its two component indices, the spectral distortion index Dλ and the spatial distortion index Ds, are applied to the quality assessment. QNR = (1 − Dλ) × (1 − Ds) is a global measure; both distortion indices are built from comparisons of correlation, luminance, and contrast between pairs of images.
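To make two of the reduced-resolution indices concrete, a minimal global-statistics sketch of SAM and ERGAS is given below; published implementations may differ in details such as the handling of zero-norm pixels.

```python
import numpy as np

def sam(ref, fus, eps=1e-12):
    """Mean spectral angle in degrees; ref and fus have shape
    (rows, cols, bands)."""
    r = ref.reshape(-1, ref.shape[-1]).astype(float)
    f = fus.reshape(-1, fus.shape[-1]).astype(float)
    cos = (r * f).sum(axis=1) / (np.linalg.norm(r, axis=1)
                                 * np.linalg.norm(f, axis=1) + eps)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))).mean()

def ergas(ref, fus, ratio=4):
    """Relative dimensionless global error in synthesis; `ratio` is the
    PAN/MS resolution ratio (4 for GeoEye-1 and QuickBird)."""
    diff = (ref - fus).reshape(-1, ref.shape[-1]).astype(float)
    mse_per_band = (diff ** 2).mean(axis=0)
    mean_per_band = ref.reshape(-1, ref.shape[-1]).astype(float).mean(axis=0)
    return 100.0 / ratio * np.sqrt((mse_per_band / mean_per_band ** 2).mean())
```

Both indices are zero for a perfect fusion, so lower values are better; note that SAM is invariant to a per-pixel scaling of the spectra, which is why it isolates spectral rather than radiometric distortion.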

Experiments on Degraded Data
As Figure 6 shows, two groups of degraded images from different satellites are used to test the proposed method. Figure 6a,e are the first group of images, from the GeoEye-1 data set, and Figure 6b,f are the second group, from the QuickBird data set. The fusion results of these two groups of degraded data for the proposed pansharpening method and the comparison methods are shown in Figures 7 and 8, respectively. The first image in Figures 7 and 8 is the reference image. The objective evaluation results are listed in Tables 2 and 3.
By analyzing the fusion results in terms of subjective visual effects, it is easy to see that GS and MTF-GLP-CBD have relatively serious spectral distortion and blurred details. The fusion results obtained by ATWT, MF-HG, and GSA-HA maintain the spectral characteristics well, and the spatial details are relatively clear. The fusion results achieved by BDSD and GSA-BPT have excessive enhancement of some edge details compared to the reference image, which leads to certain spectral distortion. By comparison, the proposed method shows better agreement with the reference image in terms of both spatial and spectral characteristics.
As listed in Tables 2 and 3, the evaluation indices of the fusion images from the different approaches tested on the degraded image pairs can be compared. According to the objective quantitative results in Table 2, all the objective indices obtained by our proposed method, with the exception of SAM, are superior to those of the other seven methods. We obtained the suboptimal value of SAM; the optimal value was acquired by MF-HG. In Table 3, our method achieved the best value on five indices: CC, SSIM, RMSE, ERGAS, and UIQI. As the colors in this group of images are relatively uniform, the minimum value of SAM is obtained by GS, and our proposed method achieves a near-optimal value. As mentioned in [43], the evaluation of pansharpening mainly has two aspects. One is the injection of spatial information, which mainly represents the improvement of image spatial resolution. The other is the preservation of spectral information, which refers to the degree of damage to the original spectrum. Normally we cannot optimize both: the injection of spatial details and the preservation of spectral information cannot both be superior, and some compromise may be required. Generally speaking, when the fusion results have better spatial quality, there will be some compromise in spectral maintenance.

Experiments on Real Data
As shown in Figure 6c,g,d,h, two groups of real data sets from different satellites were applied to evaluate the performance of the proposed method in practical applications. The first set of images is from the GeoEye-1 satellite, as illustrated in Figure 6c,g, and the second set of real data shown in Figure 6d,h was collected from the QuickBird satellite. Figures 9 and 10 show the fusion images of the comparison methods and the proposed method based on the two groups of real data in Figure 6, respectively. The corresponding objective evaluation indices are given in Tables 4 and 5.
Based on a subjective visual evaluation of Figure 9, it can be seen that the fusion result produced by GS suffers from some blurring and a certain degree of spectral distortion. The fusion result of MTF-GLP-CBD has some intensity distortion and excessive contrast in local areas. The MF-HG fusion result has relatively clear spatial details but slight spectral distortion. BDSD obtains excessive spatial details in local areas, resulting in some color distortion. The fusion result of GSA-BPT suffers from some color degradation and spectral distortion in local areas. The fusion results produced by ATWT, GSA-HA, and our proposed method achieve better visual effects than the other methods.
Figure 10 shows the fusion results from the second group of the real data set. Subjectively, the fusion image of GS has darker colors and suffers from serious spectral distortion. The result from ATWT has relatively clear spatial details, but spectral distortion occurs in some local areas. The results produced by MTF-GLP-CBD and MF-HG have relatively high contrast, which leads to some intensity and spectral distortion. The fusion results of GSA-BPT, GSA-HA, BDSD, and our proposed method have relatively better visual quality. The BDSD result shows clearer spatial details than the other three methods, but it is difficult to distinguish differences in spectral preservation among these four methods from the visual effects alone.
Tables 4 and 5 list the objective assessments of Figures 9 and 10, respectively. As can be seen from Table 4, the optimal values of the Ds and QNR indices are obtained by GSA-HA and ATWT, respectively. The QNR index of our proposed method achieves a suboptimal value, and the Ds value obtained by the proposed method is only slightly higher than the optimal one. The best Dλ is obtained by our method. In this group of images, although the spectral distortion is reduced, some spatial information is also lost. Due to the introduction of spectral modulation in our method, the spatial quality may not be good enough in some cases. Table 5 shows the objective assessment of Figure 10, from which we can see that our proposed method achieves the best values for all three indices.
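For reference, the structure of the QNR computation can be sketched as follows. This is a simplified global-statistics version with exponents p = q = 1; the original index of [42] computes UIQI over sliding windows.

```python
import numpy as np

def uiqi(a, b, eps=1e-12):
    """Universal image quality index between two single-band images
    (global version, no sliding window)."""
    a, b = a.ravel().astype(float), b.ravel().astype(float)
    ma, mb = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - ma) * (b - mb)).mean()
    return 4.0 * cov * ma * mb / ((va + vb) * (ma ** 2 + mb ** 2) + eps)

def qnr(ms, fus, pan, pan_lr):
    """QNR = (1 - D_lambda) * (1 - D_s). ms and fus have shape
    (rows, cols, bands) at the low and high resolution, respectively;
    pan_lr is the PAN image degraded to the MS scale."""
    n = ms.shape[-1]
    # D_lambda: change in inter-band relationships after fusion.
    d_lambda = np.mean([abs(uiqi(fus[..., i], fus[..., j])
                            - uiqi(ms[..., i], ms[..., j]))
                        for i in range(n) for j in range(n) if i != j])
    # D_s: change in each band's relationship with PAN across scales.
    d_s = np.mean([abs(uiqi(fus[..., i], pan) - uiqi(ms[..., i], pan_lr))
                   for i in range(n)])
    return (1.0 - d_lambda) * (1.0 - d_s), d_lambda, d_s
```

A distortion-free fusion gives Dλ = Ds = 0 and QNR = 1, which is why higher QNR and lower Dλ, Ds are better in Tables 4 and 5.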

Conclusions
A pansharpening method based on local adaptive spectral modulation (LASM) and cooperation with segmentation is proposed in this paper. The k-means algorithm is used to segment low-resolution multispectral (LRMS) images: pixels with similar spectral characteristics are clustered into the same connected component group, and local injection coefficients are then estimated based on the connected component groups. We propose LASM for the modulation of spectral information in the fusion result; its construction is based on details extracted from the original images and the local spectral relationships between multispectral (MS) bands. After detail injection and spectral modulation are completed according to the fusion model, the optimal number of segments is chosen adaptively by measuring the distance between the fusion result and the upsampled LRMS image, making the spectral characteristics of the high-resolution multispectral (HRMS) image as close as possible to those of the original LRMS image. Using the idea of cooperation between pansharpening and segmentation, image segmentation is applied to optimize the fusion result, and the parameters of the segmentation algorithm are adjusted according to feedback from the fusion image. Finally, experiments on degraded and real data sets from the GeoEye-1 and QuickBird satellites demonstrate the effectiveness and superiority of the proposed method. Compared with seven classic and state-of-the-art pansharpening methods, our method has advantages in spatial detail injection and spectral information preservation, while reducing spectral distortion.
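The adaptive choice of the number of segments summarized above can be sketched as a simple feedback loop. In this sketch, `fuse(k)` is a caller-supplied placeholder standing in for the full segmentation + injection + LASM pipeline, and the spectral distance is taken to be the mean spectral angle between the fusion result and the upsampled LRMS image.

```python
import numpy as np

def select_num_segments(lrms_up, fuse, candidates=(2, 3, 4, 5, 6)):
    """Try each candidate segment count k, run the fusion routine, and
    keep the k whose result is spectrally closest to the upsampled LRMS
    image. `fuse(k)` must return an HRMS array shaped like lrms_up."""
    def mean_angle(a, b, eps=1e-12):
        # Mean spectral angle (radians) between two (rows, cols, bands) cubes.
        a = a.reshape(-1, a.shape[-1]).astype(float)
        b = b.reshape(-1, b.shape[-1]).astype(float)
        cos = (a * b).sum(axis=1) / (np.linalg.norm(a, axis=1)
                                     * np.linalg.norm(b, axis=1) + eps)
        return np.arccos(np.clip(cos, -1.0, 1.0)).mean()
    scores = {k: mean_angle(fuse(k), lrms_up) for k in candidates}
    return min(scores, key=scores.get)
```

The candidate range and the distance measure are illustrative assumptions; any spectral fidelity index could serve as the feedback signal.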
However, in some cases, although spectral distortion is reduced, some spatial information is also lost. Because our method emphasizes spectral preservation, the spatial quality may not be good enough in some cases, and its generalization ability remains to be improved. In this paper, segmentation is not a focus of discussion but only an optimization aid. In future work, we will first focus on the influence of spectral characteristics and ground object complexity on pansharpening and design different fusion strategies according to image content; we will then study more segmentation methods that can cooperate with pansharpening and further improve the fusion quality.

Conflicts of Interest:
The authors declare no conflict of interest.