Sparsity-Based Spatiotemporal Fusion via Adaptive Multi-Band Constraints

Remote sensing is an important means of monitoring the dynamics of the Earth's surface. Because of technological limitations, it is still challenging for single-sensor systems to provide images of high spatial resolution with high revisit frequency. Spatiotemporal fusion is an effective approach to obtain remote sensing images high in both spatial and temporal resolution. Although dictionary learning fusion methods appear promising for spatiotemporal fusion, they do not consider the structure similarity between spectral bands in the fusion task. To capitalize on this feature, a novel fusion model, named the adaptive multi-band constraints fusion model (AMCFM), is formulated in this paper to produce better fusion images. This model considers the structure similarity between spectral bands and uses edge information to improve the fusion results by adopting adaptive multi-band constraints. Moreover, to address the shortcomings of the ℓ1 norm, which only considers the sparsity structure of dictionaries, our model uses the nuclear norm, which balances sparsity and correlation by producing an appropriate coefficient in the reconstruction step. We perform experiments on real-life images to substantiate our conceptual arguments. In the empirical study, the near-infrared (NIR), red and green bands of Landsat Enhanced Thematic Mapper Plus (ETM+) and Moderate Resolution Imaging Spectroradiometer (MODIS) images are fused, and the prediction accuracy is assessed by both quantitative metrics and visual effects. The experiments show that our proposed method performs better than state-of-the-art methods. It also sheds light on future research.


Introduction
Remote sensing satellites are important tools for monitoring processes such as vegetation and land cover changes on the Earth's surface [1][2][3]. Because of technological limitations in sensor designs [4], compromises have to be made between spatial and temporal resolutions. For example, the Moderate Resolution Imaging Spectroradiometer (MODIS) can visit the Earth once a day with 500-m spatial resolution. In comparison, the spatial resolution of the Landsat Enhanced Thematic Mapper Plus (ETM+) is 30 m, but its revisit period is 16 days. Such limitations restrict the application of remote sensing in problems that need images high in both spatial and temporal resolution. Spatiotemporal reflectance fusion models [5] have thus been developed to fuse image data from different sensors to obtain high spatiotemporal resolution images.
The Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) [6] is a pioneering fusion model based on a weighting method. This model uses neighboring pixels to compute the center pixel at a point in time with a weighting function, and the weights are determined by spectral difference, temporal difference and location distance. Furthermore, Zhu et al. [7] proposed an Enhanced Spatial and Temporal Adaptive Reflectance Fusion Model (ESTARFM) based on the STARFM algorithm to predict the surface reflectance of heterogeneous landscapes. Another improved STARFM method is the Spatial Temporal Adaptive Algorithm for mapping Reflectance Change (STAARCH) [8], which is designed to detect disturbance and changes in reflectance by using Tasseled Cap transformations. However, the performance of the weighting methods is constrained because linear combination smooths out the changing terrestrial contents.
Another type of reflectance fusion method, known as dictionary learning methods, has been proposed to overcome this shortcoming of the weighting methods. Dictionary-based methods that use certain known dictionaries, such as wavelets and shearlets, have proved to be efficient in multisensor and multiresolution image fusion [9][10][11]. In remote sensing data analysis, Moigne et al. [12] and Czaja et al. [13] proposed remote sensing image fusion methods based on wavelets and wavelet packets, respectively. The shearlet transform is also used in a fusion algorithm in [14] because shearlets share the same optimality properties and enjoy similar geometrical properties. Using the capability of dictionary learning and sparsity-based methods in super resolution analysis, Huang et al. [15] proposed a Sparse-representation-based Spatiotemporal Reflectance Fusion Model (SPSTFM) to integrate sparse representation and reflectance fusion by establishing correspondences between structures in high resolution images and their corresponding low resolution images through a dictionary pair and sparse coding. SPSTFM assumes that high and low resolution images of the same area have the same sparse coefficients. Such an assumption is, however, too restrictive [16]. Based on this idea, Wu et al. [17] proposed the Error-Bound-regularized Semi-Coupled Dictionary Learning (EBSCDL) model, which assumes that the representation coefficients of the image pair have a stable mapping and that the coefficients of the dictionary pair have perturbations in the reconstruction step. Attempts have been made to improve the performance of SCDL-based models. For example, Block Sparse Bayesian Learning for Semi-Coupled Dictionary Learning (BSBL-SCDL) [18] employs the structural sparsity of the sparse coefficients as a priori knowledge, and Compressed Sensing for Spatiotemporal Fusion (CSSF) [19] considers explicitly the down-sampling process within the framework of compressed sensing for reconstruction. In comparison with the weighting methods, the advantage of the dictionary-learning-based methods is that they retrieve the hidden relationship between image pairs from the sparse coding space to better capture structure changes.
Besides the aforementioned methods, some researchers have employed other approaches to fuse multi-source data. Unmixing techniques have been suggested for spatiotemporal fusion because of their ability to reconstruct images with high spectral fidelity [20][21][22][23][24]. Considering the mixed-class spectra within a coarse pixel, Xu et al. [25] proposed the Class Regularized Spatial Unmixing (CRSU) model. This method is based on the conventional spatial unmixing technique but is modified to include prior class spectra estimated from the known image pairs. To provide a formal statistical framework for fusion, Xue et al. [26] proposed Spatiotemporal Bayesian Data Fusion (STBDF), which makes use of the joint distribution to capture implicitly the temporal changes of images for the estimation of the high resolution image at a target point in time.
Because of the structure similarity between spectral bands, structure information has been employed in pan-sharpening and image fusion. Shi et al. [27], for example, proposed a learning interpolation method for pan-sharpening by expanding the sketch information of the high-resolution panchromatic (PAN) image, which contains the structure features of the PAN image. Glasner et al. [28] verified that many structures in a natural image recur at the same and different scales. Inspired by this, a self-learning approach was proposed by Khateri et al. [29], which uses similar structures at different levels to pan-sharpen low resolution multi-spectral images. In multi-modality image fusion, Zhu et al. [30] proposed a method that decomposes images into cartoon and texture components and preserves the structure information of the two components based on a spatial-based method and sparse representation, respectively.
However, none of these spatiotemporal fusion methods consider the structure similarity between spectral bands in the fusion procedure. Although different bands have different reflectance ranges, their edge information is still similar [31]. Obviously, a reconstruction model can perform better if such information is effectively used to predict the unknown high resolution image. Otherwise, the dictionary pair obtained from the training image pair is inefficient for predicting the unknown images because of the lack of information for the target time. This can be explained by the experience in machine learning that the ℓ1 norm is too restrictive for encoding unknown data in the prediction process because it only uses the sparsity structure of the dictionary [32,33]. Therefore, the reconstruction model needs a replacement of the ℓ1 norm to reduce the impact of insufficient information and to improve the representation ability of the dictionary pair.
We propose a new model in this paper to enhance spatiotemporal fusion performance. Our model uses the edge information in different bands via adaptive multi-band constraints to improve the reconstruction performance. To overcome the disadvantage of the ℓ1 norm, the nuclear norm is adopted as the regularization term to increase the efficiency of the learnt dictionary pair. The nuclear norm considers not only sparsity but also correlation, producing a suitable coefficient that harmonizes the sparse and collaborative representations adaptively [32,33].
Overall, the main contributions of this work can be summarized as follows.
• The multi-band constraints are employed to reinforce the structure similarity of different bands in spatiotemporal fusion.

• Considering the different degrees of structure similarity between any two bands, adaptive regularization parameters are proposed to determine the importance of each multi-band constraint adaptively.

• The nuclear norm is employed to replace the ℓ1 norm in the reconstruction model because the nuclear norm considers both the sparsity and the correlation of the dictionaries and can overcome the disadvantage of the ℓ1 norm.
The remainder of this paper is organized as follows. Our method for spatiotemporal fusion, called the adaptive multi-band constraints fusion model (AMCFM), is proposed in Section 2. Section 3 discusses the experiments carried out to assess the effectiveness of the AMCFM against four state-of-the-art methods in terms of statistics and visual effects. We then conclude the paper with a summary and directions for future research in Section 4.

Problem Definition
In the following, MODIS images are selected as low resolution images and Landsat ETM+ images are selected as high resolution images. As shown in Figure 1, our spatiotemporal fusion model requires three low resolution images M_1, M_2 and M_3, and two high resolution images L_1 and L_3. The high resolution image L_2 is the target image that we want to predict. Let L_ij (L_ij = L_i − L_j) and M_ij (M_ij = M_i − M_j) be the high and low resolution difference images between t_i and t_j (i, j ∈ {1, 2, 3}), respectively. We assume that changes of remote sensing images between two points in time are linear. For effectiveness, the dictionary pair D_l and D_m is trained on the difference image pair L_31 and M_31 [15]. Then, the high resolution difference image L_21 can be produced by using the dictionary pair to encode the corresponding low resolution difference image M_21. L_32 can be obtained in the same way. Finally, the high resolution image at time t_2 can be predicted as follows:

L_2 = W_21 (L_1 + L_21) + W_32 (L_3 − L_32),  (1)

where the weights W_21 and W_32 we use are the same as those in [19], which take the average of the two predicted difference images.
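The prediction in Equation (1) can be sketched in a few lines. The helper below is an illustrative sketch, not the authors' code; the equal weights W_21 = W_32 = 0.5 correspond to the averaging strategy of [19]:

```python
import numpy as np

def predict_l2(l1, l3, l21, l32, w21=0.5, w32=0.5):
    """Combine the two predicted high resolution difference images.

    L2 is estimated both forward (L1 + L21) and backward (L3 - L32);
    with w21 = w32 = 0.5 the two estimates are simply averaged.
    """
    return w21 * (l1 + l21) + w32 * (l3 - l32)
```

For per-pixel weight maps, `w21` and `w32` can be arrays of the same shape as the images, since numpy broadcasting handles both cases.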

Dictionary Learning Fusion Model
As mentioned above, conventional dictionary learning fusion models usually comprise two steps: the dictionary pair training step and the reconstruction step. The whole process is performed on each band separately. Here, we show the mathematical formulation of these two steps in SPSTFM [15], the initial dictionary learning model.

Dictionary pair training step:
In the training step, the difference image pair, L_31 and M_31, is used to train the high resolution dictionary D_l and the corresponding low resolution dictionary D_m as follows:

min_{D_l, D_m, A} ||Y − D_l A||_F^2 + ||X − D_m A||_F^2 + λ ||A||_1,  (2)

where Y and X are the column combinations of the lexicographically stacked image patches, sampled randomly from L_31 and M_31, respectively, A is the column combination of the representation coefficients corresponding to every column in Y and X, and λ is the regularization parameter. We adopt the K-SVD (K-means and Singular Value Decomposition) algorithm [34] to solve for D_l and D_m in Equation (2).

Reconstruction step:
Then, D_m is used to encode each patch of M_21, and the sparse coding coefficient α is obtained by solving the optimization problem

min_α ||m_21 − D_m α||_2^2 + λ ||α||_1,  (3)

where m_21 is a patch of M_21. The corresponding patch of the high resolution image can then be produced by

l_21 = D_l α.  (4)

Finally, all patches are merged to obtain the high resolution difference image L_21. L_32 can be obtained in the same way, and the target image L_2 can be predicted through Equation (1).
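This reconstruction step can be sketched with a generic ISTA (iterative soft-thresholding) solver for the ℓ1-regularized coding problem; the paper does not specify which ℓ1 solver SPSTFM uses, so the solver choice here is an assumption:

```python
import numpy as np

def soft_threshold(x, t):
    """Element-wise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code_ista(m21, d_m, lam=0.15, n_iter=500):
    """Solve min_a ||m21 - D_m a||_2^2 + lam * ||a||_1 by ISTA."""
    step = 1.0 / (2.0 * np.linalg.norm(d_m, 2) ** 2)  # 1 / Lipschitz constant
    alpha = np.zeros(d_m.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * d_m.T @ (d_m @ alpha - m21)  # gradient of the data term
        alpha = soft_threshold(alpha - step * grad, step * lam)
    return alpha

def reconstruct_patch(d_l, alpha):
    """High resolution patch l21 = D_l * alpha (Equation (4))."""
    return d_l @ alpha
```

For an orthonormal dictionary, ISTA recovers the closed-form solution `soft_threshold(m21, lam / 2)` in a single iteration, which is a convenient sanity check.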

Adaptive Multi-Band Constraints Fusion Model
Our model uses the same strategy for dictionary pair training and focuses on improving the reconstruction step. We propose the following model for spatiotemporal fusion by replacing the ℓ1 norm with the nuclear norm ||·||_* and incorporating the multi-band constraints. The reconstruction formulation is given as follows:

min_{α_N, α_R, α_G} Σ_{c ∈ {N,R,G}} ( ||m_21^c − D_m^c α_c||_2^2 + λ ||D_m^c Diag(α_c)||_* ) + τ_NR ||S D_l^N α_N − S D_l^R α_R||_2^2 + τ_RG ||S D_l^R α_R − S D_l^G α_G||_2^2 + τ_GN ||S D_l^G α_G − S D_l^N α_N||_2^2,  (5)

where λ, τ_NR, τ_RG and τ_GN are the regularization parameters. ||M||_* is the nuclear norm of M, i.e., the sum of all the singular values of the matrix M. For a vector v, Diag(v) represents a diagonal matrix whose diagonal elements are the corresponding elements of the vector v. S is a high-pass detector filter; here, we choose the two-dimensional Laplacian operator. The subscripts N, R and G denote the near-infrared (NIR), red and green bands, respectively. The dictionary pair D_l and D_m is trained on the difference images L_31 and M_31, which do not contain sufficient information about the images at time t_2. When reconstructing L_21 or L_32, if the model only uses the ℓ1 norm regularization, the performance is unsatisfactory. It is more reasonable to integrate the sparsity and the correlation of the dictionaries. The nuclear norm term is exactly the kind of regularization that can adaptively balance sparsity and correlation via a suitable coefficient. As shown in [33,35], when the column vectors of D_m^c (all of unit norm) are orthogonal, ||D_m^c Diag(α_c)||_* is equal to ||α_c||_1, and when they are highly correlated, it will be close to ||α_c||_2 [35]. Generally, remote sensing images in the dictionary D_m^c are neither too independent nor too correlated because the test images and training images can contain both highly correlated information (i.e., stable land-cover) and independent information (i.e., land-cover change). Therefore, as shown in Equation (6), our model can benefit from both the ℓ1 norm and the ℓ2 norm. The advantage of the nuclear norm is that it can capture the correlation structure of the training images, which the ℓ1 norm cannot. The last three terms in the model are the multi-band regularization terms. Taking the NIR band as an example, D_l^N α_N denotes a high resolution patch of the NIR band and S D_l^N α_N stands for the edges in the patch. These terms make the sparse codes (the codes may not be sparse under the nuclear norm regularization, but, for convenience and without confusion, we still call them sparse codes) in different bands no longer independent and reinforce the structure similarity of different bands.
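The edge term S D_l^N α_N can be illustrated by applying the two-dimensional Laplacian to a reconstructed patch. The sketch below uses the standard 4-neighbour Laplacian kernel; the padding choice is an assumption, since the paper does not state one:

```python
import numpy as np

def laplacian_edges(patch):
    """Apply the 2D Laplacian operator S to extract edge information.

    Zero responses on flat regions, large responses at edges. Uses
    edge-replicating padding (an assumed choice) so that a constant
    patch yields an all-zero response.
    """
    kernel = np.array([[0.0,  1.0, 0.0],
                       [1.0, -4.0, 1.0],
                       [0.0,  1.0, 0.0]])
    padded = np.pad(patch, 1, mode="edge")
    out = np.zeros_like(patch, dtype=float)
    rows, cols = patch.shape
    for i in range(rows):
        for j in range(cols):
            # the kernel is symmetric, so correlation equals convolution
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out
```

The multi-band terms then penalize the squared difference between `laplacian_edges` of the patches reconstructed in two bands.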
Nevertheless, the nuclear norm regularization and the multi-band regularization make the model more complicated to solve. In Section 2.5, we propose a method to obtain a solution efficiently.

Adaptive Parameters between Bands
The ranges of reflectance in different bands of remote sensing images differ, whereas in natural images the range of the three channels is [0, 255]. Table 1 implicitly shows the range differences of the three bands in terms of mean and standard deviation. Obviously, the structures are more similar when the means and standard deviations are closer. Based on this rationale, we propose an adaptive regularization parameter τ_NR, where m_c is the mean value of band c, σ_c is the standard deviation of band c, and γ is a parameter that controls the magnitude; τ_RG and τ_GN can be obtained from the definition as well. This parameter estimates the distribution of the reflectance values in the two bands and produces a suitable weight adaptively. When two bands have similar reflectance values, the parameter is close to γ. When the difference between the two bands increases, the parameter decreases exponentially. The more similar two bands are, the more important a role the corresponding term plays in the model. This property fits intuitive perception.
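Since the exact formula is not reproduced here, the sketch below only illustrates the stated behaviour: τ is close to γ when two bands have similar means and standard deviations, and decays exponentially as they diverge. The distance measure combining means and standard deviations is an assumed form, not necessarily the authors' definition:

```python
import numpy as np

def adaptive_tau(band_a, band_b, gamma=10.0):
    """Adaptive regularization parameter between two bands (assumed form).

    d = 0 when the two bands share mean and standard deviation, so
    tau = gamma; tau decays exponentially as the bands diverge.
    """
    d = abs(band_a.mean() - band_b.mean()) + abs(band_a.std() - band_b.std())
    return gamma * np.exp(-d)
```

Any monotone exponential decay in the mean/standard-deviation distance would exhibit the qualitative behaviour the paper describes.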

Optimization of the AMCFM
In this section, we solve the reconstruction model. For the optimization, a simplification is made first. We introduce the vectors α and m as the concatenations of the sparse coefficients and the low resolution image patches of the three bands, respectively, and D_m as a block-diagonal matrix that contains the low resolution dictionaries of the three bands on its diagonal. We also define a matrix B that collects the coefficients of the three multi-band constraint terms. Equation (5) can then be simplified into Equation (9). Here, we use the alternating direction method of multipliers (ADMM) [36][37][38] to approximate the optimal solution of Equation (9). Introducing the dual variable Z, the optimization problem can be rewritten in constrained form. The augmented Lagrangian function L of the optimization problem is then expressed in terms of a positive scalar ρ and a scaled variable µ. The ADMM consists of iterations (Equation (12)) that minimize the augmented Lagrangian function by solving each subproblem while fixing the other two variables alternately. For the step of updating α^{k+1}, a closed-form solution can be deduced. For a matrix M, diag(M) represents a vector whose ith element is the ith diagonal element of the matrix M.
For the step of updating Z^{k+1}, Z^{k+1} can be calculated by the singular value thresholding operator [39] as Z^{k+1} = D_{λ/ρ}(D_m Diag(α^{k+1}) + µ^k), where D_{λ/ρ} is the singular value shrinkage operator defined by D_{λ/ρ}(X) = U D_{λ/ρ}(Σ) V^*, with D_{λ/ρ}(Σ) = diag({max(σ_i − λ/ρ, 0)}). Here, λ/ρ is a positive scalar, UΣV^* is the singular value decomposition of a matrix X, and σ_i is the ith positive singular value of X. Using UΣV^* to denote the singular value decomposition of (D_m Diag(α^{k+1}) + µ^k) and σ_i to denote its ith positive singular value, the update of Z^{k+1} follows directly. The implementation details of the whole reconstruction procedure based on the ADMM algorithm are summarized in Algorithm 1.
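The singular value shrinkage operator D_{λ/ρ} is straightforward to implement; a minimal sketch:

```python
import numpy as np

def svt(x, tau):
    """Singular value thresholding operator D_tau [39].

    Shrinks every singular value of x by tau and discards negative
    results: D_tau(X) = U diag(max(sigma_i - tau, 0)) V^T.
    """
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return u @ np.diag(s_shrunk) @ vt
```

This operator is the proximal mapping of the nuclear norm, which is why it appears in the Z-update of the ADMM iteration.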

Algorithm 1 Reconstruction Procedure of the Proposed Method
1: Input: the regularization parameter λ, the learnt dictionary pair D_l and D_m, the low resolution difference image, and the initial parameters α^0, Z^0, µ^0 and ρ.
2: Preprocessing: normalize the low resolution difference image, then segment it into patches M = {m_i}_{i=1}^N with a 7 × 7 patch size and a four-pixel overlap in each direction.
3: Calculate: the structure similarity parameters τ_NR, τ_RG and τ_GN.
4: Repeat: (1) update the sparse coefficient α^{k+1}; (2) update Z^{k+1}; (3) update µ^{k+1}. Repeat this procedure until the convergence criterion ||D_m Diag(α) − Z||_F ≤ ε is met or the pre-specified number of iterations is reached, and obtain the desired sparse coefficient α*.
5: Output: the corresponding patch of the high resolution image is reconstructed as l = D_l α*, and the predicted image L is obtained by merging all patches.

Strategy for More Bands
The proposed method considers the structure similarity of different bands and uses pairwise comparisons of the NIR, red and green bands. It should be noted that the relationship between n and m is quadratic, m = (n^2 − n)/2, where n and m represent the number of bands and the number of multi-band regularization terms, respectively. When n increases, the model becomes much more complicated and difficult to solve.
Table 2 shows that adjacent bands have consistent bandwidths. This property indicates that the structures of adjacent bands are more similar than those of the other pairs. It is thus reasonable to use adjacent-band constraints instead of pairwise comparisons of all bands. In this case, the number of combinations is smaller and m becomes linear in n (m = n − 1). Therefore, to extend the model efficiently to more bands, we adopt the strategy of only considering the structure similarity of adjacent bands. The resulting smaller model (AMCFM-s) keeps only the adjacent-band constraint terms of Equation (5). The procedure for solving AMCFM-s is the same as that for AMCFM; details can be found in Section 2.5.
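The term counts for the two strategies can be expressed as a small helper (the function name is illustrative):

```python
def n_constraint_terms(n_bands, adjacent_only=False):
    """Number of multi-band regularization terms.

    Pairwise comparisons of all bands give (n^2 - n) / 2 terms,
    quadratic in n; the AMCFM-s strategy keeps only adjacent-band
    pairs, giving n - 1 terms, linear in n.
    """
    if adjacent_only:
        return n_bands - 1
    return (n_bands * n_bands - n_bands) // 2
```

For the three-band case, the full model has 3 terms and AMCFM-s has 2; for a seven-band sensor, the counts would be 21 versus 6, which illustrates why the adjacent-band strategy scales better.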

Experiments
The performance of our proposed method is compared with that of four state-of-the-art methods for evaluation: ESTARFM [7], a weighting method; CRSU [25], an unmixing-based method; and two dictionary learning methods, SPSTFM [15] and EBSCDL [17].
All programs are run on a Windows 10 system (Microsoft, Redmond, WA, USA) with an Intel Core i7-6700 3.40 GHz processor (Intel, Santa Clara, CA, USA). All of these fusion algorithms are coded in Matlab 2015a (MathWorks, Natick, MA, USA) except ESTARFM, which is coded in IDL 8.5 (Harris Geospatial Solutions, Broomfield, CO, USA).

Experimental Scheme
In this experiment, we use data acquired from the Boreal Ecosystem-Atmosphere Study (BOREAS) southern study area on 24 May, 11 July and 12 August 2001. The products from Landsat ETM+ and MODIS (MOD09GHK) are selected as the source data for fusion. The Landsat image on 11 July 2001 is set as the target image for prediction. All the data are registered for fine geographic calibration.
In the fusion process, we focus on three bands: NIR, red and green. The size of the test images is 300 × 300. Before the test, we up-sample the MODIS images to the same resolution as the Landsat images via bi-linear interpolation because the spatial resolutions of the two source images are different.

Parameter Settings and Normalization
The parameters of AMCFM are set as follows: the dictionary size is 256, the patch size is 7 × 7, the overlap of patches is 4, the number of training patches is 2000, λ is 0.15, α^0 is 0, Z^0 and µ^0 are both 0, and ρ is 0.1. All the comparative methods keep their original parameter settings.
Normalization can speed up the computation and affects the fusion results. As a preprocessing step, the high and low resolution images are normalized as

L_norm = (L − L̄) / σ_L,

where L̄ is the mean value of image L and σ_L is the standard deviation of image L.
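The normalization step can be sketched as:

```python
import numpy as np

def normalize_image(img):
    """Zero-mean, unit-variance normalization: (L - mean(L)) / std(L)."""
    return (img - img.mean()) / img.std()
```

After this transform, every image has mean 0 and standard deviation 1, which puts the bands of the two sensors on a common scale before fusion.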

Quality Measurement of the Fusion Results
Several metrics have been used to evaluate the fusion results by different methods.These metrics can be classified into two types, namely the band quality metrics and the global quality metrics.
We employ three assessment metrics, namely the root mean square error (RMSE), average absolute difference (AAD) and correlation coefficient (CC) to assess the performance of the algorithms in each band.The ideal result is 0 for RMSE and AAD, while it is 1 for CC.
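These band quality metrics can be sketched as follows; this is a straightforward implementation of the standard definitions, not the authors' code:

```python
import numpy as np

def band_metrics(pred, true):
    """RMSE, AAD and CC between a predicted and a reference band.

    RMSE and AAD are 0 for a perfect prediction; CC is 1.
    """
    diff = pred - true
    rmse = np.sqrt(np.mean(diff ** 2))           # root mean square error
    aad = np.mean(np.abs(diff))                  # average absolute difference
    cc = np.corrcoef(pred.ravel(), true.ravel())[0, 1]  # correlation coefficient
    return rmse, aad, cc
```

A constant bias, for instance, leaves CC at 1 while RMSE and AAD both equal the bias magnitude, which is why the three metrics are reported together.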
Three other metrics are adopted to evaluate the global performance, including relative average spectral error (RASE) [40], Erreur Relative Globale Adimensionnelle de Synthèse (ERGAS) [41] and Q4 [42].The mean RMSE (mRMSE) of three bands is also used as a global index.The ideal result is 0 for mRMSE, RASE and ERGAS, while it is 1 for Q4.It should be noted that Q4 is defined for four spectral bands.For our comparisons, the real part of a quaternion is set to 0.

Results
Tables 3-5 show the quantitative results of these methods in each band. All of these methods can reconstruct the target high resolution image. ESTARFM performs well in the red band. CRSU performs well in the red and green bands of image 2, but in most cases this method produces undesirable results. SPSTFM and EBSCDL have similar results, with EBSCDL producing slightly higher quality in these three images. AMCFM and AMCFM-s produce the best results for the NIR band. Moreover, AMCFM has the best or the second best results in almost all metrics, showing its stability and efficiency. The global metrics of the different methods are shown in Tables 6-8. AMCFM has the best global performance in all three images, except for Q4 in image 2 and ERGAS in image 3. Image 1 is best captured by our proposed model, with a noticeable performance in all four metrics. The outstanding performance of AMCFM is attributed to its improved performance in the NIR band. Figures 2 and 3 compare the target (true) Landsat images with the images predicted by ESTARFM, CRSU, SPSTFM, EBSCDL, AMCFM and AMCFM-s. We use the NIR-red-green bands as the red-green-blue composite to show the images. These images are displayed with a 2% linear enhancement in ENVI 5.3 (Harris Geospatial Solutions, Broomfield, CO, USA).
All of these fusion algorithms have the capability to reconstruct the main structure and details of the target image. The colors produced by the dictionary learning methods appear visually more similar to the true Landsat image than those of the weighting method and the unmixing-based method. The details captured by AMCFM are more prominent than those captured by SPSTFM and EBSCDL, as can be observed in the two-times enlarged red box in the images. Overall, our proposed method has the best visual performance.
Figures 4-6 display the 2D scatter plots of the NIR, red and green bands of image 1. ESTARFM performs slightly better than the other methods in the red band, consistent with the statistics in Table 3. However, in the NIR and green bands, the dictionary learning methods clearly outperform the weighting method and the unmixing-based method because the scatter plots of ESTARFM and CRSU are more dispersed. The scatter plots of our proposed methods, AMCFM and AMCFM-s, are closer to the 1-1 line than those of the other methods, indicating that using the edge information can indeed improve the fusion performance, especially in the NIR band. In general, Figures 4-6 show that our proposed methods reconstruct images closest to the true Landsat image.

Discussion
Although the model performs well in the experiments, there still exist some questions to be discussed. Therefore, more experiments are performed to answer these questions.

Which Condition Is Better for AMCFM
Tables 6-8 show that image 1 best fits our model. This can be explained by the level of details in Table 9. We employ the ratio of the standard deviation to the mean to represent the level of details of a target image, as in [19]. It is clear that image 1 has the highest level of details in the NIR band and the most similar levels of details across the three bands. Therefore, more structure similarity information can be captured to improve the fusion results. When there is a large divergence in a certain band, such as the red band in image 2, the results of the dictionary learning methods in this band are unsatisfactory. In this situation, ESTARFM performs better in the red band.
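The level-of-details index from [19] is simply the ratio of the standard deviation to the mean; a one-line sketch:

```python
import numpy as np

def level_of_details(band):
    """Level-of-details index of a band: standard deviation / mean,
    as used in [19] and Table 9. Assumes a strictly positive mean,
    which holds for reflectance values."""
    return band.std() / band.mean()
```

A flat band scores 0; the more texture a band contains relative to its brightness, the larger the index.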

Computational Cost
Computational cost is an important factor in practical applications. Table 10 records the running time of all algorithms on image 1. It shows that SPSTFM has the fastest running speed. EBSCDL is time-consuming because the algorithm models the relationship between high and low resolution patches by a mapping function. AMCFM is a little slower than EBSCDL because of the complexity of the ADMM algorithm. However, given the improvement in the results obtained, the slightly increased running time is acceptable. To accelerate the computation, an alternative approach could be designed to solve the reconstruction model more efficiently, or the program could be coded with Graphics Processing Unit (GPU) support for parallel execution in future work.

Parameters
The parameters of the multi-band constraints determine the importance of the corresponding terms in the fusion model, and the scalar γ directly affects the value of the parameter τ. To find a suitable γ, Figure 7 depicts how Q4 changes with respect to γ. Q4 is an index that encapsulates both spectral and radiometric measurements of the fusion result; thus, we choose it to reflect the fusion results. A larger value of Q4 means a better fusion performance. When γ is smaller than 10, the performance of AMCFM evidently improves as γ increases. However, Q4 hardly increases when γ is larger than 10. Therefore, we set γ to 10.

Conclusions and Future Work
In this paper, we have proposed a novel dictionary learning fusion model, called AMCFM. This model considers the structure similarity between bands via adaptive multi-band constraints. These constraints essentially enforce the similarity of the edge information across bands in high resolution patches to improve the fusion performance. Moreover, unlike existing dictionary learning models, which only emphasize sparsity, we use the nuclear norm as the regularization term to represent both sparsity and correlation. Therefore, our model can reduce the impact of an inefficient dictionary pair and improve the representation ability of the dictionary pair. Compared with four state-of-the-art fusion methods in metrics and visual effects, the experimental results support our proposed model's improvements in image fusion. Although our model is slower than the other two dictionary learning methods in this empirical analysis because of the complexity of the optimization algorithm, the fusion results obtained from our model are indeed improved. One may wonder whether it is justifiable to achieve a slight improvement at the expense of an increase in computational time. Our argument is that, on a theoretical basis, our model is more reasonable and appealing than SPSTFM and EBSCDL because it capitalizes on the structure information and the correlation of the dictionaries for image fusion. Such advantages will become more evident as structure similarity increases.
However, there remains some room for improvement. Firstly, the ℓ2 norm loss term assumes that the noise is i.i.d. Gaussian. We can consider other noise hypotheses, such as an i.i.d. Gaussian mixture or a non-i.i.d. noise structure, to improve the fusion results. Secondly, the computational cost of the proposed method is high because of the complexity of the ADMM algorithm. To reduce the computation time, an alternative approach could be designed to solve the reconstruction model more efficiently for practical applications. Finally, to analyze hyperspectral data efficiently, dimension reduction methods might need to be incorporated into the fusion process.

Figure 1 .
Figure 1. Input images and the target image for spatiotemporal fusion (t_1 < t_2 < t_3). Three low resolution images M_1, M_2 and M_3, and two high resolution images L_1 and L_3 are known. The high resolution image L_2 is the target image to be predicted.

Property 1. Assume that all columns of D_m^c have unit norm. When the column vectors of D_m^c are orthogonal, ||D_m^c Diag(α_c)||_* is equal to ||α_c||_1; when the column vectors of D_m^c are highly correlated, ||D_m^c Diag(α_c)||_* will be close to ||α_c||_2.

Figure 4 .
Figure 4. Scatter plots of NIR band of image 1. Abscissa is the true reflectance and ordinate is the predicted reflectance.

Figure 5 .
Figure 5. Scatter plots of red band of image 1. Abscissa is the true reflectance and ordinate is the predicted reflectance.

Figure 6 .
Figure 6. Scatter plots of green band of image 1. Abscissa is the true reflectance and ordinate is the predicted reflectance.

Table 1 .
Mean and standard deviation in three bands of a multi-band image.

Table 2 .
Bandwidth of Landsat and MODIS.

Table 3 .
Band quality metrics in image 1.

Table 4 .
Band quality metrics in image 2.

Table 5 .
Band quality metrics in image 3.

Table 6 .
Global quality metrics in image 1.

Table 7 .
Global quality metrics in image 2.

Table 8 .
Global quality metrics in image 3.

Table 9 .
Level of details in a target image.