An Improved Version of the Generalized Laplacian Pyramid Algorithm for Pansharpening

Abstract: The spatial resolution of multispectral data can be synthetically improved by exploiting the spatial content of a companion panchromatic image. This process, named pansharpening, is widely employed by data providers to augment the quality of images made available for many applications. The huge demand requires efficient fusion algorithms that do not need specific training phases, but rather exploit physical considerations to combine the available data. For this reason, classical model-based approaches are still widely used in practice. We developed and assessed a method for improving a widespread approach, based on the generalized Laplacian pyramid decomposition, by combining two different cost-effective upgrades: the estimation of the detail-extraction filter from the data and the use of an improved injection scheme based on multilinear regression. The proposed method was compared with several existing efficient pansharpening algorithms, employing the most credited performance evaluation protocols. Its capability of achieving optimal results in very different scenarios was demonstrated on data acquired by the IKONOS and WorldView-3 satellites.


Introduction
Pansharpening [1][2][3][4][5] has generated growing interest in recent years due to the numerous requests for accurate reproductions of the Earth's surface, which have pushed researchers to enhance the performance of algorithms based on remotely sensed data. Indeed, pansharpening represents a crucial step in the production of images aimed at visual interpretation in widely exploited software such as Google Earth and Bing Maps. Likewise, many other applications take advantage of this kind of fused data, for instance, agriculture (e.g., for crop type [6] and tree species [7] classification and for precision farming [8]), land cover change detection (e.g., for snow [9], forest [10] and urban [11] monitoring), archaeology [12] and even space mission data analysis [13].
Although the term is often used for a large set of combined data, the pansharpening process more precisely indicates the enhancement of a multispectral (MS) image through fusion with a higher resolution panchromatic (PAN) image representing the same scene, as depicted in Figure 1. This setting allows one to obtain a very high quality final product, since the acquisitions can be collected almost contemporaneously from the same platform, thanks to the availability of the two required sensors on many operating satellites.
Several algorithms have been developed for completing the pansharpening process. They can be categorized in different ways according to their main features. In particular, a useful taxonomy that can guide the choice of the user distinguishes two main classes, composed of classical and emerging approaches [5]. Essentially, the first group includes the methods which have been developed over the years, starting from the analysis of the physical processes underlying the acquisition of the signals involved. It includes both component substitution (CS) approaches-for example those based on intensity-hue-saturation (IHS) [14,15], Gram-Schmidt [16,17] or principal component analysis [18,19] decompositions-and multiresolution analysis (MRA) methods, which use signal decompositions based on box filters, Laplacian pyramids [20][21][22][23], wavelets [24][25][26] and morphological filters [27][28][29]. Instead, the methods contained in the second group are more focused on the optimization of the fusion result, aiming at obtaining the best image quality through the application of more general estimation approaches. Techniques based on variational optimization (VO) approaches [30][31][32] and on machine learning (ML) [33][34][35][36][37] belong to this class. The literature shows several papers in which the results achievable through these approaches are compared [1,5,[38][39][40][41]. As in many other fields, the recent development of more efficient computational approaches has constituted a major breakthrough in data fusion, making feasible the utilization of variational and ML-based methods. In more detail, in the last decade, pioneering works in the ML category were compressive sensing and dictionary-based solutions, such as [31,[42][43][44][45][46].
Subsequently, deep learning (DL) approaches became more and more popular in the remote sensing field [47], including pansharpening [33][34][35][36][48][49][50][51][52][53][54][55], and intimately related tasks such as super-resolution [56][57][58][59] and hyper-/multi-spectral data fusion [60,61]. The main issue of the above-mentioned ML approaches is the assumption of a training paradigm relying on a resolution downgrade process (e.g., Wald's protocol). More recently, different training paradigms, mainly based on multi-objective strategies, such as in [62,63], have been proposed to address such a problem.
On the other hand, a careful analysis of the literature testifies that the performance enhancements obtained through recent implementations of classical methods, such as those proposed in [64][65][66], or efficient implementations of VO, such as the one proposed in [30], lead to high quality pansharpened images that do not require extensive training phases.
For this reason, we focus in this paper on possible improvements of these efficient approaches; in particular, we exploit the classical method scheme that is composed of two successive phases [1]: (i) the extraction of the details from a high resolution PAN image and (ii) the injection of those details into a low resolution MS image. We tackled the investigation of both phases, following the lines traced by the recent literature.
In more detail, we focus on an MRA approach based on the generalized Laplacian pyramid (GLP) [20] with a modulation transfer function (MTF)-matched filter. Namely, the details are extracted from the PAN image through a filter whose amplitude response is matched to the MTF of the MS sensor [21]. This technique aims at extracting the data most relevant to the enhancement of the MS image spatial resolution, since it isolates the PAN information that was cut off by the MS acquisition process. The cited reviews contain a vast list of pansharpening approaches using MTF-shaped extraction filters. However, the exact form of the MS sensor's MTF is frequently unavailable in practice, due to the lack of accurate and updated on-board measurements of the actual response, which changes over time as the acquisition device ages [67]. Accordingly, good practice consists of estimating the actual shape of the detail-extraction filter directly from the data [30]. If the response is significantly different across bands, it is advisable to estimate a different filter for each band [68].
Typically, the subsequent phase of detail injection is completed by adding the image extracted from the PAN image weighted by an injection coefficient matrix, which can in general contain a different entry for each pixel and each band. The values of the specific weights can be derived through physical considerations (as it happens for the HPM method [67,69]) or through mathematical optimization approaches starting from suitable criteria. The projective (or regression-based) injection model [21,70] belongs to the latter type, since it implements the minimum mean square error (MMSE) estimation approach. However, the described linear injection rule does not always represent the optimal choice, as it can also be argued from the studies that propose non-linear approaches, implemented through local methods [15,66,71] or non-linear networks [33,48]. In this work we elaborate on this thesis, while aiming to preserve the computational efficiency of classical methods.
To that end, we adopted a slight generalization of the linear approach that consists of estimating the best polynomial approximation of the optimal relationship between the details extracted from the PAN image and those missing in the MS image. This approach allows one to estimate the optimal injection coefficients, according to the MMSE criterion, through a simple closed formula implementing a multilinear regression (MLR) scheme [29].
The original contribution of this study lies in the definition of a novel fusion architecture. The combination of the filter estimation and the MLR injection approach has been assessed and compared to several existing approaches. We show that with the conjunction of the two techniques, it is possible to obtain remarkable robustness against the diversity of the illuminated scenes, thereby enhancing a key feature of algorithms in practical applications. We tested the proposed method using two different real datasets, acquired from the IKONOS and the WorldView-3 sensors, which allowed us to evaluate the effective performance in different working scenarios. The quality of the final products has been assessed by exploiting both the reduced scale and the full scale assessment protocols [30]. The former allowed us to evaluate the performance of the algorithms in a controlled scenario, where the original MS images were used as references for the fusion of images that were purposely degraded to lower resolution. The latter constituted a realistic scenario in which the MS and PAN images were combined at the original resolution, in the absence of reference images.
In the next section, we describe the problem at hand, define the quantities used in the paper and provide an overview of the proposed approach. The following two sections are devoted to detailed descriptions of the chosen implementations of the two phases that compose the fusion process. In Section 5, we describe the simulation setting and report the outcomes of the experimental tests. The discussion of the results is in Section 6. Finally, conclusions are drawn in Section 7.

Problem Statement
Firstly, we introduce the mathematical notation that will be used in this paper. Bold uppercase letters, e.g., X, indicate images. Accordingly, P is used to denote the panchromatic (PAN) image, which is an L r × L c two-dimensional array, where L r and L c are the numbers of rows and columns of the PAN image, respectively. Instead, the multispectral (MS) image is a three-dimensional array with dimensions L r /R × L c /R × B, where B is the number of bands, and R is the resolution ratio between the original MS and the PAN data; it is denoted by M = {M b } b=1,...,B , where M b indicates the b-th spectral band.
The purpose of pansharpening is to identify a near-optimal procedure to obtain L r × L c × B MS data, denoted by $\widehat{\mathbf{M}} = \{\widehat{\mathbf{M}}_b\}_{b=1,\ldots,B}$, characterized by the same spectral resolution as the original MS image M and the same spatial resolution as the PAN image P. We also define $\widetilde{\mathbf{M}} = \{\widetilde{\mathbf{M}}_b\}_{b=1,\ldots,B}$, the L r × L c × B MS image obtained by solely upsampling M to the PAN scale.
The general formulation of a classical fusion process is given by the following expression [1]:

$$\widehat{\mathbf{M}}_b = \widetilde{\mathbf{M}}_b + F_b[\Delta \mathbf{P}_b], \quad b = 1, \ldots, B, \qquad (1)$$

where the details are defined as

$$\Delta \mathbf{P}_b = \mathbf{P}_b - \mathbf{P}^L_b, \qquad (2)$$

i.e., as the differences between the band-by-band equalized PAN image $\mathbf{P}_b$ (the equalization is performed as suggested in [27]) and its low-pass filtered version $\mathbf{P}^L_b$; $F_b[\cdot]$ are the functions (different for each band $b$) that inject the PAN details into each MS band.
According to (1), in classical methods each band is treated independently and an additive form is assumed for the injection procedure. The different techniques adopted to compute $\mathbf{P}^L_b$ (and hence the details $\Delta \mathbf{P}_b$) and $F_b[\cdot]$ identify the specific approaches. More specifically, if $\mathbf{P}^L_b$ is obtained by combining the channels of the interpolated MS image $\widetilde{\mathbf{M}}$, the method is said to belong to the CS class, whereas if $\mathbf{P}^L_b$ is obtained by extracting the low-pass part from the PAN image P, the method is said to belong to the MRA class. This distinction is not purely formal, since the two classes can be shown to be characterized by very different visual and quantitative features [72].
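The additive scheme in (1) can be sketched, for any choice of the detail-extraction and injection steps, as follows; the array names and the per-band injection callable are hypothetical illustrations, not the paper's implementation:

```python
import numpy as np

def classical_fusion(ms_up, pan_eq, pan_low, inject):
    """Generic classical pansharpening step, Eq. (1):
    each band of the upsampled MS image receives the PAN details
    Delta P_b = P_b - P_b^L through an injection function F_b.
    ms_up:   upsampled MS image, shape (H, W, B)
    pan_eq:  band-equalized PAN, one plane per band, shape (H, W, B)
    pan_low: low-pass filtered PAN planes, shape (H, W, B)
    inject:  callable (delta_p, b) -> injected details for band b."""
    fused = np.empty_like(ms_up)
    for b in range(ms_up.shape[-1]):
        delta_p = pan_eq[..., b] - pan_low[..., b]   # PAN details of band b
        fused[..., b] = ms_up[..., b] + inject(delta_p, b)
    return fused
```

With the identity injection `inject = lambda d, b: d`, this reduces to plain additive detail injection; CS and MRA methods differ only in how `pan_low` is produced.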
Starting from this framework, in this work we propose an architecture, depicted in Figure 2, in which the main steps are the following ones.
(1) Filter estimation. The low-pass filtered PAN image P L b is obtained via an MRA scheme, where the low-pass filter impulse response h b (that can be different for each band b ∈ {1, · · · , B}) is estimated from the data by using semiblind deconvolution [30,68], as will be detailed in Section 3.
(2) Multilinear regression-based injection. The functions F b [·] are supposed to have a nonlinear form that can be approximated by a polynomial expansion of order M, whose coefficients are computed by using a multilinear regression (MLR) approach [29]. The complete description of this procedure will be provided in Section 4.
Indeed, our purpose was to evaluate the joint effect of both filter estimation and MLR-based injection on the final fused product. Therefore, in the following sections, we present in detail our proposal for implementing these two steps.

MRA via Filter Estimation
In this section we present two MRA approaches based on the estimation of the low-pass filter impulse response, namely, the filter estimation (FE) and the multi-band filter estimation (MBFE) algorithms, introduced in [30,68], respectively.
Both these algorithms rely on approaching the estimation problem in the following form. Let $\mathbf{x} \in \mathbb{R}^N$ be a single-band lexicographically ordered image, where $N = L_r L_c$, and let $\mathbf{y} \in \mathbb{R}^N$ be its spatially degraded version, obtained via a low-pass filter (LPF) with finite impulse response $\mathbf{h} \in \mathbb{R}^N$ and the addition of noise $\mathbf{n} \in \mathbb{R}^N$. In formulas, we can write the following equation:

$$\mathbf{y} = C(\mathbf{x})\,\mathbf{h} + \mathbf{n},$$

where $C(\cdot)$ is an operator that generates a block circulant with circulant blocks (BCCB) matrix by suitably rearranging the entries of its argument, such that the matrix product between $C(\mathbf{x}) \in \mathbb{R}^{N \times N}$ and $\mathbf{h}$ yields the convolution between the image $\mathbf{x}$ and the filter $\mathbf{h}$ [68,73]. In this setup, the filter estimation problem is addressed by solving the constrained minimization problem:

$$\widehat{\mathbf{h}} = \arg\min_{\mathbf{h} \in \mathcal{H}} \|\mathbf{y} - C(\mathbf{x})\mathbf{h}\|^2 + \lambda \|\mathbf{h}\|^2 + \mu \left( \|C(\mathbf{d}_v)\mathbf{h}\|^2 + \|C(\mathbf{d}_h)\mathbf{h}\|^2 \right) \quad \text{s.t.} \quad \mathbf{h}^T \mathbf{1} = 1, \qquad (3)$$

where $(\cdot)^T$ is the transpose operator; $C(\mathbf{d}_v) \in \mathbb{R}^{N \times N}$ and $C(\mathbf{d}_h) \in \mathbb{R}^{N \times N}$ are BCCB matrices that are computed from the filters $\mathbf{d}_v$ and $\mathbf{d}_h$ and that perform the first-order finite difference in the vertical and horizontal directions, respectively; and $\mathbf{1}$ is a vector of all ones. This formulation derives from the combination of different terms. The first one is the so-called data-fitting term, which is the main quantity to be optimized. Then, there are two regularization terms, introduced due to the ill-posedness of this estimation problem [74], that put some constraints on the resulting solution. The first regularization term (namely, $\|\mathbf{h}\|^2$) helps to obtain a limited-energy solution. The second regularization term (namely, $\|C(\mathbf{d}_v)\mathbf{h}\|^2 + \|C(\mathbf{d}_h)\mathbf{h}\|^2$) is useful to enforce a smooth solution. The weights $\lambda$ and $\mu$ tune the importance of the regularization terms when finding the solution $\mathbf{h}$, which is subject to two constraints: it has to be normalized (i.e., $\mathbf{h}^T \mathbf{1} = 1$) and limited to a finite, non-empty and convex support $\mathcal{H} \subset \mathbb{R}^N$. As stated in [30,68], the choice of the squared $\ell_2$ norm $\|\cdot\|^2$ allows one to solve the problem in (3) in closed form.
Indeed, the quadratic cost function within the minimization problem in (3) attains its (global) minimum at

$$\widehat{\mathbf{h}} = \left( C(\mathbf{x})^H C(\mathbf{x}) + \lambda \mathbf{I} + \mu \left( C(\mathbf{d}_v)^H C(\mathbf{d}_v) + C(\mathbf{d}_h)^H C(\mathbf{d}_h) \right) \right)^{-1} C(\mathbf{x})^H \mathbf{y}, \qquad (4)$$

where $\mathbf{I}$ is the identity matrix and $(\cdot)^H$ indicates the Hermitian transpose operator.
A computationally efficient solution of (4) exists and can be written explicitly in the frequency domain as

$$\widehat{\mathbf{h}} = \mathcal{F}^{-1} \left\{ \frac{\mathcal{F}\{\mathbf{x}\}^* \circ \mathcal{F}\{\mathbf{y}\}}{\mathcal{F}\{\mathbf{x}\}^* \circ \mathcal{F}\{\mathbf{x}\} + \lambda + \mu \left( \mathcal{F}\{\mathbf{d}_v\}^* \circ \mathcal{F}\{\mathbf{d}_v\} + \mathcal{F}\{\mathbf{d}_h\}^* \circ \mathcal{F}\{\mathbf{d}_h\} \right)} \right\}, \qquad (5)$$

where the $\circ$ symbol and the fraction bar indicate pointwise (entry-by-entry) multiplication and division, respectively; $\mathcal{F}\{\cdot\}$ and $\mathcal{F}^{-1}\{\cdot\}$ are the 2-D discrete Fourier transform operator and its inverse, respectively; and $(\cdot)^*$ denotes the complex conjugate. Indeed, due to their BCCB structure, the matrices $C(\mathbf{x})$, $C(\mathbf{d}_h)$ and $C(\mathbf{d}_v)$ can be diagonalized by the 2-D discrete Fourier transform matrix, leading to a computational cost dominated by the number of operations required to perform the fast Fourier transform (FFT), i.e., $O(N \log N)$. Finally, in order to fully profit from the FFT, which assumes a periodic boundary structure of the image, a preprocessing step is needed to smooth out the unavoidable discontinuities present in real-world images. The adopted solution relies on blurring the image borders, as suggested in the image processing literature [75].
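A minimal NumPy sketch of the frequency-domain estimator in (5), assuming periodic boundaries, zero-padded first-order difference filters, and illustrative regularization weights:

```python
import numpy as np

def estimate_filter_fft(x, y, lam=1e-3, mu=1e-3):
    """Closed-form regularized deconvolution in the frequency domain, Eq. (5).
    x: sharp single-band image; y: its blurred version (same 2-D shape).
    Returns the estimated impulse response h, normalized to unit gain."""
    X, Y = np.fft.fft2(x), np.fft.fft2(y)
    # First-order finite-difference filters d_v, d_h, zero-padded to image size
    dv = np.zeros_like(x); dv[0, 0], dv[1, 0] = 1.0, -1.0
    dh = np.zeros_like(x); dh[0, 0], dh[0, 1] = 1.0, -1.0
    Dv, Dh = np.fft.fft2(dv), np.fft.fft2(dh)
    num = np.conj(X) * Y
    den = np.conj(X) * X + lam + mu * (np.conj(Dv) * Dv + np.conj(Dh) * Dh)
    h = np.real(np.fft.ifft2(num / den))
    return h / h.sum()          # normalization constraint h^T 1 = 1
```

In a full implementation one would also restrict the support of `h` to a finite window and pre-blur the image borders, as discussed above.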
In the following, we briefly describe the two most promising and effective approaches to the filter estimation task for pansharpening:
• The filter estimation (FE) method [30], which estimates a single filter for all the MS channels;
• The multi-band filter estimation (MBFE) algorithm [68], which overcomes the main limitation of FE by estimating a filter for each band.

FE Algorithm
This approach consists of estimating a single filter; i.e., for all spectral bands $b \in \{1, \cdots, B\}$, $\mathbf{h}_b$ is equal to the same estimated filter, say $\mathbf{h}_P$. The $P$ subscript refers to the fact that the FE algorithm exploits the relationship between the PAN image $\mathbf{p} \in \mathbb{R}^N$ and an equivalent PAN image $\mathbf{p}_e \in \mathbb{R}^N$, generated by projecting the MS image into the PAN domain via the formula

$$\mathbf{p}_e = \widetilde{\mathbf{M}}_{\mathrm{aug}} \, \mathbf{w}_{\mathrm{aug}}. \qquad (6)$$

In (6), $\widetilde{\mathbf{M}}_{\mathrm{aug}}$ is composed of the lexicographically ordered interpolated MS bands $\{\widetilde{\mathbf{m}}_b\}_{b=1,\ldots,B}$ and the all-ones vector $\mathbf{1}$, and $\mathbf{w}_{\mathrm{aug}} = [\mathbf{w}^T, w_0]^T$ is composed of the vector $\mathbf{w} = [w_1, \cdots, w_b, \cdots, w_B]$, whose elements are the weights measuring the overlap between the PAN image and each spectral band, and a bias coefficient $w_0$. The algorithm is based on the following two interdependent alternating steps, starting from an initial estimate for the filter $\mathbf{h}_P$.
• Estimation of the weights $\mathbf{w}_{\mathrm{aug}}$. This step consists of imposing the equality between the image $\mathbf{p}_e$ defined in (6) and a low-pass filtered version of $\mathbf{p}$ computed via the current estimate of $\mathbf{h}_P$. Therefore, the estimate of the weights $\mathbf{w}_{\mathrm{aug}}$ is easily found via a classic multivariate regression framework.
• Estimation of the filter $\mathbf{h}_P$. This estimate is found by using (5), in which $\mathbf{p}_e$ plays the role of the blurred and degraded image $\mathbf{y}$ and $\mathbf{p}$ plays the role of the image $\mathbf{x}$. The resulting filter is finally normalized (in order to have a unitary gain), and the values outside a given window are set to zero.
To promote fast convergence of the iterative algorithm (usually within a couple of iterations), the Starck and Murtagh low-pass filter [76] (used in pansharpening in the popular "à trous" algorithm) is chosen as the initial guess for the filter $\mathbf{h}_P$. Moreover, the maximum number of iterations is fixed at 10, in order to ensure that the algorithm stops.
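Under the assumptions above, the alternating FE loop can be sketched as follows; `estimate_filter` and `lowpass` are hypothetical callables standing in for the frequency-domain estimator of (5) and for filtering with the current impulse response (with the initial guess handled inside `lowpass` when `h` is `None`):

```python
import numpy as np

def fe_alternating(pan, ms_up, estimate_filter, lowpass, n_iter=10, tol=1e-6):
    """Sketch of the FE alternating scheme:
    (i)  regress a low-pass PAN onto the interpolated MS bands plus a bias
         to get w_aug (multivariate regression step);
    (ii) re-estimate the filter h_P from the equivalent PAN p_e via (5).
    pan: (H, W) PAN image; ms_up: (H, W, B) interpolated MS image."""
    n = pan.size
    # Augmented regressor matrix [m_1, ..., m_B, 1] of Eq. (6)
    A = np.column_stack([ms_up[..., b].ravel() for b in range(ms_up.shape[-1])]
                        + [np.ones(n)])
    h = None                                  # initial guess handled by `lowpass`
    p_low = lowpass(pan, h)
    for _ in range(n_iter):
        w_aug, *_ = np.linalg.lstsq(A, p_low.ravel(), rcond=None)
        p_e = (A @ w_aug).reshape(pan.shape)  # equivalent PAN, Eq. (6)
        h_new = estimate_filter(pan, p_e)     # p plays x, p_e plays y in (5)
        p_low = lowpass(pan, h_new)
        if h is not None and np.max(np.abs(h_new - h)) < tol:
            h = h_new
            break
        h = h_new
    return h, w_aug
```

The stopping rule (filter change below `tol`, capped at `n_iter` iterations) mirrors the convergence behavior described in the text.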

MBFE Algorithm
An effective algorithm aimed at estimating a specific degradation filter $\mathbf{h}_b$ for each $b \in \{1, \cdots, B\}$ is the MBFE method [68]. The low coherence between some bands of the MS image and the PAN image prevents satisfactory performance when estimating $\mathbf{h}_b$ directly from the PAN image [30]. This problem is solved by generating an initial estimate $\mathbf{M}^0 = \{\mathbf{M}^0_b\}_{b=1,\ldots,B}$ of the fused image that can be used as an approximation of the ground truth (GT) for the filter estimation task, ensuring higher coherence between the high and the low resolution MS images.
Natural candidates for estimating $\mathbf{M}^0$ are the CS-based methods. Indeed, they are able to generate a fused product that completely retains the PAN spatial details [30], which are key for an accurate estimation of the filters. Therefore, for each band $b \in \{1, \cdots, B\}$, the estimation of $\mathbf{h}_b$ is performed by using (5), in which $\mathbf{m}_b$ plays the role of the blurred and degraded image $\mathbf{y}$ and $\mathbf{m}^0_b$ (i.e., the lexicographically ordered version of $\mathbf{M}^0_b$) plays the role of the image $\mathbf{x}$. Among the CS algorithms, the band-dependent spatial-detail (BDSD) technique [77] and the Gram-Schmidt adaptive (GSA) method [17] have proved to generate initial guess images $\mathbf{M}^0$ that are well suited for use in MBFE [68]. Therefore, in the following, we will consider two MBFE variants: MBFE BDSD and MBFE GSA.
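Given any per-band filter estimator (e.g., the frequency-domain solution of (5)) as a callable, the MBFE step reduces to a short loop; all names here are hypothetical illustrations:

```python
import numpy as np

def mbfe_filters(ms_up, ms0, estimate_filter):
    """MBFE sketch: one degradation filter per band, estimated against a
    CS-fused product M0 (e.g., from GSA or BDSD) used as approximate GT.
    ms_up: MS bands interpolated to the PAN grid, shape (H, W, B)
    ms0:   CS-fused image on the same grid, shape (H, W, B)
    estimate_filter(x, y): estimator of the filter degrading x into y."""
    return [estimate_filter(ms0[..., b], ms_up[..., b])  # m0_b is x, m_b is y
            for b in range(ms0.shape[-1])]
```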

MLR-Based Injection
In this section we briefly present the MLR-based injection scheme that is the second pillar of the proposed pansharpening architecture. This scheme is a natural extension of the following classical approaches, in which F b [·] is linear.
• CBD injection scheme. In the context-based decision (CBD) injection model, for each channel $b$, the details of the PAN image are multiplied by a scalar coefficient, namely,

$$\widehat{\mathbf{M}}_b = \widetilde{\mathbf{M}}_b + g_b \, \Delta \mathbf{P}_b.$$

The injection coefficients $g_b$, $\forall b \in \{1, \cdots, B\}$, are computed by the regression of the $b$-th MS channel on the PAN image. It is worth noting that this scheme is also used in other pansharpening algorithms, such as the aforementioned GSA.
• HPM injection scheme. The high-pass modulation (HPM) injection scheme relies on the pointwise multiplication of the PAN details by a coefficient matrix $\mathbf{G}_b$, according to

$$\widehat{\mathbf{M}}_b = \widetilde{\mathbf{M}}_b + \mathbf{G}_b \circ \Delta \mathbf{P}_b, \quad \mathbf{G}_b = \frac{\widetilde{\mathbf{M}}_b}{\mathbf{P}^L_b}.$$

Also in this case, other pansharpening algorithms use this scheme, such as the Brovey transform (BT), a classic multiplicative scheme belonging to the CS family [78].
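For a single band, the two classical gains can be sketched in a few lines; the covariance-ratio form of the CBD gain and the small `eps` guard against division by zero in HPM are implementation assumptions of this sketch:

```python
import numpy as np

def cbd_gain(ms_band, pan_low):
    """CBD: scalar injection gain from the regression of the MS band on the
    low-pass PAN, i.e., g_b = cov(M_b, P_b^L) / var(P_b^L)."""
    c = np.cov(ms_band.ravel(), pan_low.ravel())
    return c[0, 1] / c[1, 1]

def hpm_gain(ms_band, pan_low, eps=1e-12):
    """HPM: pointwise injection matrix G_b = M_b / P_b^L (entry-by-entry)."""
    return ms_band / (pan_low + eps)
```

The CBD gain is a single scalar per band, while the HPM gain is a full matrix that modulates the details pixel by pixel, which explains the very different visual behavior of the two schemes noted in the Discussion.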
On the contrary, the proposed scheme employs a non-linear injection function $F_b[\cdot]$ in (1), approximated via a polynomial expansion of order $M$ around zero, i.e.,

$$F_b[\Delta \mathbf{P}_b] \approx \sum_{m=0}^{M} g_{b,m} \, (\Delta \mathbf{P}_b)^{\circ m},$$

where $(\cdot)^{\circ m}$ denotes the entry-wise $m$-th power. This formulation is linear with respect to the coefficients $\{g_{b,m}\}_{m=0,\ldots,M}$; therefore, it is possible to use the MLR framework [79], which is the multidimensional extension of the classic ordinary least squares method. In more detail, we should estimate the coefficients $\{g_{b,m}\}_{m=0,\ldots,M}$ by solving the problem

$$\Delta \mathbf{M}_b = \sum_{m=0}^{M} g_{b,m} \, (\Delta \mathbf{P}_b)^{\circ m} + \mathbf{R}_b,$$

where $\Delta \mathbf{M}_b$ are the details of the target MS image and the optimal (in the least-squares sense) coefficients $\{g_{b,m}\}_{m=0,\ldots,M}$ minimize the Frobenius norm of the residuals $\mathbf{R}_b$. Unfortunately, it is impossible to compute the details of the target MS image, because this would require the knowledge of $\mathbf{M}_b$, that is, the very image to estimate. Therefore, the solution is to solve the reduced resolution companion problem, defined in terms of the corresponding reduced resolution versions, indicated by the superscript RR:

$$\Delta \mathbf{M}^{RR}_b = \sum_{m=0}^{M} g_{b,m} \, (\Delta \mathbf{P}^{RR}_b)^{\circ m} + \mathbf{R}^{RR}_b,$$

where $\mathbf{P}^{RR}_b$ is the downsampled version of $\mathbf{P}^L_b$, defined in Section 2 (see also Figure 2). Finally, according to the findings of [29], we use $M = 2$, which provides a good trade-off between the complexity of the model and its performance.
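Since the expansion is linear in the coefficients, the reduced-resolution fit reduces to ordinary least squares on a polynomial design matrix; a sketch with hypothetical names:

```python
import numpy as np

def fit_mlr_injection(delta_m, delta_p, order=2):
    """Fit the polynomial injection coefficients {g_{b,m}} for one band by
    ordinary least squares: delta_m ~ sum_m g_m * delta_p**m, m = 0..order.
    In the proposed scheme, both inputs are reduced-resolution details."""
    dp = delta_p.ravel()
    A = np.column_stack([dp ** m for m in range(order + 1)])  # design matrix
    g, *_ = np.linalg.lstsq(A, delta_m.ravel(), rcond=None)
    return g

def apply_mlr_injection(delta_p, g):
    """Evaluate the fitted polynomial on the full-resolution PAN details."""
    return sum(gm * delta_p ** m for m, gm in enumerate(g))
```

With `order=2` (the paper's choice M = 2), each band needs only three coefficients, so the extra cost over the linear schemes is negligible.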

Experimental Results
The crucial phase of this study was constituted by the experimental tests that allowed us to verify the suitability of the proposed technique. Due to the above-mentioned lack of a reference image, the assessment of fusion algorithms had to be performed in two distinct steps, involving the evaluation of the performance at both reduced scale and full scale [1]. The former consists of reproducing the fusion problem at a lower resolution through the appropriate degradation of the available images. In particular, the choice of a scaling factor equal to the resolution ratio R allows one to downsize the PAN image to the resolution of the original MS image, which can thus be used as the fusion target. In this case, some indexes can be used for the evaluation of the image quality. On the other hand, changing the working resolution does not represent an ideal solution for two main reasons. The information concerning the illuminated area is somewhat different at the two scales, due to a substantial reduction of perceivable details. Moreover, the image processing algorithms employed for coarsening the available images can only approximate the sensor acquisition, leading to significant deviations from the actual operating scenarios. For this reason, the analysis of the algorithms' behavior at the original resolution is mandatory, even though the reliability of the applicable quality measures is questionable.
We utilized two different datasets for the assessment of the proposed technique, comparing it to the MS image interpolation using a polynomial kernel with 23 coefficients (EXP) and many classical approaches, i.e., non-linear IHS (NLIHS) [15], Gram-Schmidt (GS) [16], Gram-Schmidt adaptive (GSA) [17], band-dependent spatial-detail (BDSD) [77], smoothing filter-based intensity modulation (SFIM) [69], additive à trous wavelet transform (ATWT) [26] and the pyramidal decomposition scheme using morphological filters based on half gradient (MF-HG) [80]. Obviously, we included in the comparison the baseline methods that constitute the starting point of the proposed approach, namely, the GLP using MTF-matched filter [21] with both multiplicative (HPM) [26] and regression-based (CBD) [70] injection models. Moreover, we considered the more recent versions of the MTF-GLP approaches that include the filter estimation procedure, based on the estimation of either a single filter (FE) [30], or a different filter for each band (MBFE) [68]. Analogously, we report the results related to the sole introduction of the MLR injection scheme [29] within the baseline methods.

Datasets
The datasets were selected to cover the most typical settings encountered in practice. For this reason, two different cases have been considered: the fusion of a PAN image with a 4-band MS image and the fusion of a PAN image with an 8-band MS image. Moreover, the two datasets refer to different scenarios, one constituted by a mountainous, partly vegetated area, and one representing an urban zone.
The China dataset was also used in the assessment of the classical methods presented in [1] and is composed of images collected by the IKONOS sensor over the Sichuan region in China (see the images on the left in Figure 3). The MS image has four channels (blue, green, red and near infra-red (NIR)) and the spatial sampling interval (SSI) is 1.2 m. The IKONOS resolution ratio between the MS and the PAN image is R = 4 and the radiometric resolution is 11 bits. A cut of size 300 × 300 pixels of the original MS image was employed in this work as the ground truth (GT) for the reduced resolution assessment.
The Tripoli dataset was acquired by the WorldView-3 satellite and was used to test the capability of the proposed method for the enhancement of an MS image with a larger number of bands (coastal, blue, green, yellow, red, red edge, NIR1 and NIR2). The employed imagery included a PAN image of size 1024 × 2048 pixels and an MS image of size 256 × 256 pixels; the resolution ratio was, also in this case, R = 4. The MS WorldView (WV)-3 sensor is characterized by a 1.2 m SSI and the radiometric resolution is 11 bits. The Tripoli dataset was used both at the original scale for performing the full resolution assessment and at a lower scale, obtained by degrading the original images by a factor R = 4.

Reduced Resolution Validation
The reduced resolution (RR) assessment represents a crucial phase in the evaluation of algorithm performance, since the availability of the reference image allows one to accurately evaluate the quality of the final product. The method consists of using the original MS image as the target product that has to be obtained by combining the degraded versions of the same MS image and of the original PAN. The procedure for completing such experiments has been formalized through Wald's protocol [81], which is based on both the consistency and the synthesis properties. While the latter specifies the characteristics of the fused image, the former requires that the fused high resolution MS image yields the low resolution MS image once properly degraded. This means that the degradation systems should reproduce the overall acquisition process that yields the real images. Accordingly, the amplitude frequency responses of the MS degradation filters are matched to the MTFs of the MS sensor, while an ideal filter is employed to decimate the PAN image [21].
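The degradation step of the protocol can be sketched as follows. The Gaussian shape for the MTF-matched filter and the nominal Nyquist gain `gnyq` are common modeling assumptions (not values from this paper), and a simple Gaussian low-pass stands in for the ideal PAN filter:

```python
import numpy as np

def _gauss_lpf_fft(band, sigma):
    """Circular Gaussian low-pass applied via its exact frequency response."""
    fy = np.fft.fftfreq(band.shape[0])[:, None]
    fx = np.fft.fftfreq(band.shape[1])[None, :]
    H = np.exp(-2.0 * np.pi**2 * sigma**2 * (fy**2 + fx**2))
    return np.real(np.fft.ifft2(np.fft.fft2(band) * H))

def wald_degrade(ms, pan, ratio=4, gnyq=0.3):
    """Reduced-resolution (Wald) protocol sketch: each MS band is blurred with
    a Gaussian whose gain at the MS Nyquist frequency f = 1/(2*ratio) equals
    the assumed MTF value gnyq, then decimated by `ratio`; the PAN is decimated
    by the same factor after a stand-in low-pass step."""
    # exp(-2*pi^2*sigma^2*(1/(2R))^2) = gnyq  =>  sigma = (R/pi)*sqrt(-2*ln(gnyq))
    sigma = (ratio / np.pi) * np.sqrt(-2.0 * np.log(gnyq))
    ms_rr = np.stack([_gauss_lpf_fft(ms[..., b], sigma)[::ratio, ::ratio]
                      for b in range(ms.shape[-1])], axis=-1)
    pan_rr = _gauss_lpf_fft(pan, 0.5 * ratio)[::ratio, ::ratio]  # stand-in LPF
    return ms_rr, pan_rr
```

Both filters have unit DC gain, so flat regions keep their radiometry after degradation, as the consistency property requires.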
The quality of the fused products can thus be assessed through several indexes that have been developed over the years. We adopted four widespread measures: the well-known peak signal-to-noise ratio (PSNR); the relative dimensionless global error in synthesis (ERGAS) [82], a normalized version of the root mean square error (RMSE); the spectral angle mapper (SAM) [83], which quantifies the spectral distortion as the mean angle between the fused and reference pixel vectors; and the Q2 n [84,85], which extends the universal image quality index (UIQI) [86] to multi-channel images.
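As an illustration, two of these measures (SAM and ERGAS) can be sketched in a few lines; the array layout (bands on the last axis) and the small `eps` guard are assumptions of this sketch:

```python
import numpy as np

def sam_degrees(ref, fused, eps=1e-12):
    """Mean spectral angle (degrees) between reference and fused pixel
    vectors; arrays have shape (H, W, B) with bands on the last axis."""
    dot = np.sum(ref * fused, axis=-1)
    norms = np.linalg.norm(ref, axis=-1) * np.linalg.norm(fused, axis=-1)
    ang = np.arccos(np.clip(dot / (norms + eps), -1.0, 1.0))
    return np.degrees(ang.mean())

def ergas(ref, fused, ratio=4):
    """ERGAS = (100/ratio) * sqrt(mean over bands of (RMSE_b / mean_b)^2),
    where mean_b is the mean of the b-th reference band."""
    terms = [(np.sqrt(np.mean((ref[..., b] - fused[..., b]) ** 2))
              / ref[..., b].mean()) ** 2 for b in range(ref.shape[-1])]
    return (100.0 / ratio) * np.sqrt(np.mean(terms))
```

Both indexes are zero for a perfect reconstruction, with lower values indicating better quality; Q2 n and PSNR are omitted here for brevity.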
The main results of the RR assessment are summarized in Table 1, where the comparison of the proposed technique with both the baseline methods and the other cited pansharpening approaches is shown.
In order to give some additional insight into data fusion performance, we also present some closeups for the two RR datasets in Figures 4 and 5. Focusing only on the GLP detail-extraction scheme, we also show several Q2 n maps for the Tripoli dataset in Figure 6.

Full Resolution Validation
The full resolution (FR) assessment allowed us to analyze the behavior of the algorithms at their effective working scale. In particular, the Tripoli dataset contains images with high resolution details, namely, with physical dimensions very similar to the SSI of the sensors. Accordingly, it can be properly used for this second investigation phase, in which the visual analysis assumes a central role. In fact, all the available quality indexes cannot be considered totally reliable because they assess the final product without a reference image.
For this reason, the quantitative evaluation is typically performed by measuring the coherence of the pansharpened product with the original available images. In particular, one assesses the spectral similarity of the fused image and the low resolution MS image and the correspondence between the PAN and MS spatial details at the original and enhanced resolutions [87,88].
The quality with no reference (QNR) [88] index is the best known measure adopting this rationale. It is composed of a spectral index that measures the relationships among the MS channels and a spatial index that quantifies the quantity and the appropriateness of the spatial details present in each band. Several other quality indexes have been proposed in the literature for the full resolution assessment [23,[89][90][91][92]. We adopted here the hybrid QNR (HQNR) [93], which combines the use of the QNR spatial index D S and of the spectral index D λ proposed in [89], thereby providing appreciable soundness and computational efficiency [5]. Table 2 reports the values of the HQNR indexes computed by applying the considered pansharpening algorithms to the Tripoli dataset.

Discussion
The reduced resolution and the full resolution assessment protocols allowed us to provide a clear illustration of the results achievable through the proposed pansharpening scheme. The key considerations that can be derived from these two complementary phases are detailed in the following sections.

Reduced Resolution
The most evident property is the high performance achieved by the proposed approach in both tests. This behavior stood out on our datasets, which present complementary features of the illuminated scene. Indeed, as can be noticed by examining most of the compared algorithms, it is difficult to find algorithms characterized by optimal performance in both scenarios. Even the baseline methods, namely, the implementations of the GLP approaches exploiting the HPM and the CBD injection schemes, were not immune to this performance tradeoff, due to the large presence of high resolution details in the Tripoli dataset, whose counterparts are the large homogeneous zones in the China dataset.
In both cases, the best results in terms of the most comprehensive index, namely, the Q2 n , were obtained by the methods that utilize the multi-band filter estimation. In particular, the MBFE-BDSD-MLR approach achieved the highest value for the China dataset, and the MBFE-GSA-MLR produced the best image for the Tripoli dataset. An important remark regards the single filter estimation approach (FE-MLR), which showed remarkable robustness, since it achieved results very similar to those of the best methods; this was partly due to the shape of the MTFs of the various bands, which are characterized by almost equal gains at the Nyquist frequency. Moreover, one can note that the improvement of the final product quality implied by the filter estimation procedure is never in terms of the spectral quality of the image, as demonstrated by the higher (i.e., worse) values of the SAM index with respect to the GLP-MLR. In fact, the MLR coefficient estimation aims at optimizing the detail injection scheme for each specific channel, without taking into account the spectral coherence of the final product. Naturally, this issue is made worse by the multiband approach, which estimates a different filter for each band, causing a more significant spectral unbalance in the pansharpened image. Nevertheless, this issue is largely compensated by the greater ability of injecting the most useful spatial information contained in the PAN image.
The suitability of the proposed approaches is also testified by the closeups shown in Figures 4 and 5, which highlight the capability of producing images with an accurate spatial reproduction of the details, without excessively sacrificing the chromatic fidelity of the objects present in the scene. The performance analysis can be eased by evaluating the algorithms in pairs, as we show in Figure 6, where the MBFE-GSA-MLR is compared, in terms of the Q2n map, to six other methods based on the GLP detail-extraction scheme. The green pixels represent the zones of the images in which the MBFE-GSA-MLR achieved higher quality index scores with respect to the competitors, and the red pixels highlight the opposite. The proposed method achieved almost uniform performance improvements with respect to the approaches compared in panels (a)-(d). More specifically, panels (b) and (d) show that the MBFE-GSA-MLR is significantly superior to the algorithms that implement either the MLR injection scheme alone (GLP-MLR) or the MBFE technique alone (MBFE-GSA-CBD), thereby motivating the joint use of the two methods. The uniformity of the green pixels demonstrates that the performance increase was not due to the improvement of a specific zone of the image, but rather to a more precise evaluation of the best extraction filter and of a specific formula for the data combination. Panels (e) and (f) illustrate a more diversified result, which is a consequence of an alternative injection scheme. In fact, the HPM modulates the PAN details point-wise, thereby achieving a very different final product. However, the higher overall quality of the MBFE-GSA-MLR approach can be easily inferred from the larger extent of the green zones, which are also characterized by a high color saturation, indicating a significant improvement of the Q2n value.
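The pairwise visualization described above can be sketched as follows: given two precomputed local quality maps (e.g., sliding-window Q2n values, assumed here to be plain arrays of equal shape), each pixel is labeled +1 (green) where the first method wins and -1 (red) where it loses. This is an illustrative reconstruction of the map construction, not the authors' code.

```python
import numpy as np

def comparison_map(q_a, q_b, tol=0.0):
    """Per-pixel win/lose map between two local quality maps:
    +1 (green) where method A scores higher than B by more than tol,
    -1 (red) where B wins, 0 where the two are within tolerance."""
    diff = q_a - q_b
    out = np.zeros(diff.shape, dtype=int)
    out[diff > tol] = 1
    out[diff < -tol] = -1
    return out
```

Rendering the map with a green/red colormap whose saturation follows `abs(q_a - q_b)` would reproduce the effect described for Figure 6, where strongly saturated green zones indicate large Q2n gains.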

Full Resolution
The results corroborate the analysis carried out at reduced resolution, showing that the approaches using an estimated filter for the detail extraction and a multilinear regression for the injection obtained the best overall results. In particular, the use of the specific index D_S, assessing the spatial quality of the images, confirms the deduction that the main improvements were obtained in terms of a more faithful reproduction of the geometric information.
Further information can be derived from the visual analysis of the pansharpened products. We present in Figure 7 the injected details, namely, the differences between the final fused products and the upsampled multispectral images. Panels (m) and (n) immediately stand out for the richness and intelligibility of the representation, which testify to the accurate reproduction of the object borders detectable at the full resolution. Moreover, although Table 2 confirms that the proposed methods resulted in a slightly worse spectral quality of the final products, the homogeneity of the detail images demonstrates that the details were not excessively boosted in any specific zone or band, as happened for the HPM-based schemes.
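The detail images just described are simply per-band differences between the fused product and the upsampled MS image; a small sketch (with illustrative array names and a per-band min-max stretch added for display purposes only) could be:

```python
import numpy as np

def injected_details(fused, ms_up):
    """Detail image per band: difference between the pansharpened
    product and the upsampled MS, stretched to [0, 1] for display."""
    d = fused.astype(float) - ms_up.astype(float)
    # per-band min-max normalization for visualization
    lo = d.min(axis=(0, 1), keepdims=True)
    hi = d.max(axis=(0, 1), keepdims=True)
    return (d - lo) / np.maximum(hi - lo, 1e-12)
```

Inspecting these images band by band makes over-injection easy to spot: a well-behaved scheme yields detail maps that are spatially homogeneous, whereas an aggressive one shows strongly boosted regions.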

Computational Analysis
The analysis of the computational complexity of the proposed approach is finally reported in Table 3, which contains the times required by an Intel® Core™ i7 3.2 GHz processor to complete the fusion process. Almost all the approaches exploiting a multiresolution decomposition of the images required perceptibly more computational effort, since the considered images are quite large. A further increase occurred for the filter estimation procedure, whose cost is proportional to the number of impulse responses to be estimated. Accordingly, the multiband (MBFE) approach took twice as much computational time as the baseline GLP approach in the case of eight bands, whereas the additional effort required by the single-filter (FE) method is almost negligible. In any case, the main point is that the proposed approach strictly preserves the feasibility of the classical methods, thereby representing a viable technique for processing large amounts of data.

Conclusions
In this work, a step forward has been made with respect to existing techniques for the efficient combination of multispectral and panchromatic images acquired by the same satellite. The usefulness of pansharpened data in many applications demands that providers perform the fusion process efficiently, and thus most algorithms still resort to physical models to ease their adaptation to specific datasets.
In this class of approaches, the generalized Laplacian pyramid has emerged as the most widespread method, since it combines the accurate reproduction of the acquisition process characteristics with high computational efficiency. However, some improvements that do not result in an excessive computational burden are conceivable. More specifically, in this work we validated the joint use of a filter estimation procedure, which allows one to easily adapt the shape of the detail-extraction filters to the specific imagery, and of a polynomial combination function, which allows one to more properly inject the PAN information.
The effectiveness of the proposed scheme has been tested on two different datasets, which are characterized by unalike features of the illuminated scenes and were acquired by different sensors. The most important quality of the designed approach is its capacity to achieve the best performance among the tested methods in both scenarios, unlike all the existing techniques. Among the possible implementations of the proposed approach, this study has also highlighted that the estimation of a single filter for all the multispectral channels yields a still more efficient algorithm, without significantly sacrificing the overall quality of the final product.
Finally, future studies and developments will be devoted to extending the proposed architecture to hyperspectral sharpening, given the great interest in the related applications [94,95], and to the fusion of thermal data [96].
Funding: This research received no external funding.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available since they have been removed from the original URL.

Conflicts of Interest:
The authors declare no conflict of interest.