Multisensor Super Resolution Using Directionally-Adaptive Regularization for UAV Images

In various unmanned aerial vehicle (UAV) imaging applications, multisensor super-resolution (SR) has long been a challenging problem and has attracted increasing attention. Multisensor SR algorithms utilize multispectral low-resolution (LR) images to reconstruct a high-resolution (HR) image and thereby improve the performance of the UAV imaging system. The primary objective of this paper is to develop a multisensor SR method based on the existing multispectral imaging framework instead of using additional sensors. In order to restore image details without noise amplification or unnatural post-processing artifacts, this paper presents an improved regularized SR algorithm that combines directionally-adaptive constraints with a multiscale non-local means (NLM) filter. As a result, the proposed method can overcome the physical limitation of multispectral sensors by estimating the color HR image from a set of multispectral LR images using intensity-hue-saturation (IHS) image fusion. Experimental results show that the proposed method provides better SR results than existing state-of-the-art SR methods in the sense of objective measures.


Introduction
Multispectral images contain complete spectrum information at every pixel in the image plane and are currently applied to various unmanned aerial vehicle (UAV) imaging applications, such as environmental monitoring, weather forecasting, military intelligence, target tracking, etc. However, it is not easy to acquire a high-resolution (HR) image using a multispectral sensor because of the physical limitations of the sensor. A simple way to enhance the spatial resolution of a multispectral image is to increase the number of photo-detectors, but this comes at the cost of sensitivity and signal-to-noise ratio due to the reduced pixel size. In order to overcome such physical limitations of a multispectral imaging sensor, an image fusion-based resolution enhancement method is needed [1,2].
Various enlargement and super-resolution (SR) methods have been developed in many application areas over the past few decades. The goal of these methods is to estimate an HR image from one or more low-resolution (LR) images. They can be classified into two groups: (i) single image based; and (ii) multiple image based. The latter requires a set of LR images to reconstruct an HR image. It performs the warping process to align multiple LR images with a sub-pixel precision. If LR images are degraded by motion blur and additive noise, the registration process becomes more difficult. To solve this problem, single image-based SR methods became popular, including: an interpolation-based SR [3][4][5][6], patch-based SR [7][8][9][10][11][12], image fusion-based SR [13][14][15][16][17], and others [18].
In order to solve the problem of simple interpolation-based methods, such as linear and cubic-spline interpolation [3], a number of improved and/or modified versions of image interpolation methods have been proposed in the literature. Li et al. used the geometric duality between the LR and HR images using local variance in the LR image [4]. Zhang et al. proposed an edge-guided non-linear interpolation algorithm using directionally-adaptive filters and data fusion [5]. Giachetti et al. proposed a curvature-based iterative interpolation using a two-step grid filling and an iterative correction of the estimated pixels [6]. Although the modified versions of interpolation methods can improve the image quality in the sense of enhanced edge sharpness and visual quality to a certain degree, fundamental interpolation artifacts, such as blurring and jagging, cannot be completely removed, due to the nature of the interpolation framework.
Patch-based SR methods estimate an HR image from the LR image, which is considered as a noisy, blurred and down-sampled version of the HR image. Freeman et al. proposed the example-based SR algorithm using the hidden Markov model that estimates the optimal HR patch corresponding to the input LR patch from an external training dataset [7]. Glasner et al. used a unified SR framework based on patch similarity between in- and cross-scale images in the scale space [8]. Yang et al. used patch similarity from a learning dataset of HR and LR patch pairs in the sparse representation model [9]. Kim et al. proposed a sparse kernel regression-based SR method using kernel matching pursuit and gradient descent optimization to map the pairs of trained example patches from the input LR image to the output HR image [10]. Freedman et al. used non-dyadic filter banks to preserve the properties of an input LR image and searched for similar patches using local self-similarity within a locally-limited region [11]. He et al. proposed a Gaussian regression-based SR method using soft clustering based on the local structure of pixels [12]. Existing patch-based SR methods can better reduce the blurring and jagging artifacts than interpolation-based SR methods. However, non-optimal patches make the restored image look unnatural, because of the inaccurate estimation of the high-frequency components.
On the other hand, image fusion-based SR methods have been proposed in the remote sensing field. The goal of these methods is to improve the spatial resolution of LR multispectral images using the detail of the corresponding HR panchromatic image. Principal component analysis (PCA)-based methods use the projection of the image into a differently-transformed space [13]. Intensity-hue-saturation (IHS) [14,15] and Brovey [17] methods consider the HR panchromatic image as a linear combination of the LR multispectral images, and Ballester et al. proposed an improved variational-based method [16]. Since these methods all assume that an HR panchromatic image is available in addition to the LR multispectral images, conventional image fusion-based SR methods require an additional HR panchromatic imaging sensor.
In order to improve the performance of the fusion-based SR methods, this paper presents a directionally-adaptive regularized SR algorithm. Assuming that the HR monochromatic image is a linear combination of multispectral images, the proposed method consists of three steps: (i) acquisition of monochromatic LR images from the set of multispectral images; (ii) restoration of the monochromatic HR image using the proposed directionally-adaptive regularization; and (iii) reconstruction of the color HR image using image fusion. The proposed SR algorithm is an extended version of the regularized restoration algorithms proposed in [19,20] for optimal adaptation to directional edges and uses interpolation algorithms in [21][22][23] for resizing the interim images at each iteration.
The major contribution of this work is two-fold: (i) the proposed method can estimate the monochromatic HR image using directionally-adaptive regularization that provides the optimal adaptation to directional edges in the image; and (ii) it uses an improved version of the image fusion method proposed in [14,15] to reconstruct the color HR image. Therefore, the proposed method can generate a color HR image without additional high-cost imaging sensors using image fusion and the proposed regularization-based SR method. In experimental results, the proposed SR method is compared with seven existing image enlargement methods, including interpolation-based, example-based SR and patch similarity-based SR methods in the sense of objective assessments.
The rest of this paper is organized as follows. Section 2 summarizes the theoretical background of regularized image restoration and image fusion. Section 3 presents the proposed directionally-adaptive regularized SR algorithm and image fusion. Section 4 summarizes experimental results on multi- and hyper-spectral images, and Section 5 concludes the paper.

Theoretical Background
The proposed multispectral SR framework is based on regularized image restoration and multispectral image fusion. This section presents the theoretical background of multispectral image representation, regularized image restoration and image fusion in the following subsections.

Multispectral Image Representation
A multispectral imaging sensor measures the radiance of multiple spectral bands whose range is divided into a series of contiguous and narrow spectral bands. On the other hand, a monochromatic or single-band imaging sensor measures the radiance of the entire spectrum of the wavelength. The relationship between the multispectral and monochromatic gray-level images over the wavelength range between ω_1 and ω_2 is modeled as [24]:

f = K ∫_{ω_1}^{ω_2} q(ω) R(ω) dω + η_(ω_1∼ω_2),    (1)

where R(ω) represents the spectral radiance through the sensor's entrance pupil and K a constant that is determined by the sensor characteristics, including the electronic gain, detector saturation, quantization levels and the area of the aperture. q(ω) is the spectral response function of the sensor in the wavelength range between ω_1 and ω_2, and η_(ω_1∼ω_2) is the noise generated by the dark signal.
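The band-integration model in Equation (1) can be sketched numerically. In the snippet below, the Gaussian spectral response q(ω) and the gain K = 1 are illustrative assumptions, not the sensor values from the paper; only the overall structure (weighted integration of R(ω) over a wavelength range plus a noise term) follows Equation (1).

```python
import numpy as np

# Sketch of Equation (1): f = K * ∫ q(w) R(w) dw + noise over [w1, w2].
# The Gaussian response q(w) and K = 1 are illustrative assumptions.
def integrate_bands(radiance, wavelengths, w1, w2, K=1.0, noise=0.0):
    """Approximate the monochromatic response by a Riemann sum over bands."""
    mask = (wavelengths >= w1) & (wavelengths <= w2)
    w = wavelengths[mask]
    dw = w[1] - w[0]                      # uniform band spacing assumed
    # assumed Gaussian spectral response centered in the band
    q = np.exp(-((w - w.mean()) ** 2) / (2.0 * ((w2 - w1) / 4.0) ** 2))
    return K * np.sum(q * radiance[..., mask], axis=-1) * dw + noise

wl = np.linspace(400.0, 720.0, 33)        # 33 bands, 400-720 nm, as in the paper
R = np.ones((4, 4, 33))                   # toy 4x4 multispectral image, flat radiance
mono = integrate_bands(R, wl, 400.0, 720.0)
```

With a flat radiance, the result is a constant image whose value is the integral of the assumed response function.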
Since the spectral radiance R(ω) does not change by the sensor, the initial monochromatic HR image is generated by Equation (1), and it is used to reconstruct the monochromatic HR image from multispectral LR images using image fusion [14,15]. Figure 1 shows the multispectral imaging process, where a panchromatic image is acquired by integrating the entire spectral band, and the corresponding RGB image is also acquired using, for example, 33 bands. The high-resolution (HR) panchromatic and low-resolution (LR) RGB image are fused to generate an HR color image.

Multispectral Image Fusion
In order to improve the spatial resolution of multispectral images, the intensity-hue-saturation (IHS) image fusion method is widely used in remote sensing fields [14,15]. More specifically, this method converts a color image into the IHS color space, where only the intensity band is replaced by the monochromatic HR image. The resulting HR image is obtained by converting the replaced intensity and the original hue and saturation back to the RGB color space.
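The intensity-replacement step above can be sketched compactly. This uses the fast additive formulation (band + (pan − I)), which is equivalent to converting to IHS, swapping the intensity, and converting back when the intensity is modeled as I = (R + G + B)/3; the simple averaging model is an assumption of this sketch.

```python
import numpy as np

# Minimal IHS-style fusion sketch: the intensity of the upsampled LR
# color image is replaced by the HR monochromatic image, using the
# fast additive form band + (pan - I), with I = (R + G + B) / 3.
def ihs_fuse(rgb_up, pan_hr):
    """rgb_up: (H, W, 3) upsampled color image; pan_hr: (H, W) HR intensity."""
    I = rgb_up.mean(axis=-1, keepdims=True)       # intensity component
    return rgb_up + (pan_hr[..., None] - I)       # inject HR detail into each band

rgb = np.full((2, 2, 3), 0.5)
pan = np.full((2, 2), 0.6)                         # HR pan brighter by 0.1
fused = ihs_fuse(rgb, pan)
```

When the pan image equals the intensity of the color image, the fusion leaves the color image unchanged, which is a useful sanity check.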

Regularized Image Restoration
Regularization-based image restoration or enlargement algorithms regard the noisy, LR images as the output of a general image degradation process and incorporate a priori constraints into the restoration process to make the inverse problem better posed [19][20][21][22][23].
The image degradation model for a single LR image can be expressed as:

g = Hf + η,    (2)

where g represents the observed LR image, H the combined low-pass filtering and down-sampling operator, f the original HR image and η the noise term. The restoration problem is to estimate the HR image f from the observed LR image g. Therefore, the regularization approach minimizes the cost function:

f̂ = arg min_f { ‖g − Hf‖² + λ‖Cf‖² },    (3)

where C represents a two-dimensional (2D) high-pass filter, λ the regularization parameter and ‖Cf‖² the energy of the high-pass filtered image, which represents the amount of noise amplification in the restoration process.
The derivative of Equation (3) with respect to f is computed as:

∂/∂f { ‖g − Hf‖² + λ‖Cf‖² } = −2H^T(g − Hf) + 2λC^T Cf,    (4)

which becomes zero if:

(H^T H + λC^T C) f = H^T g.    (5)

Thus, Equation (5) can be solved using the well-known regularized iteration:

f^{k+1} = f^k + β ( H^T g − (H^T H + λC^T C) f^k ),    (6)

where H^T H + λC^T C represents the better-posed system matrix, and the step length β should be sufficiently small to guarantee convergence.
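The iteration in Equation (6) can be demonstrated on a toy one-dimensional problem. The 2x down-sampling operator H (with no blur, for brevity) and the first-difference high-pass filter C below are illustrative stand-ins for the paper's operators, not its actual kernels.

```python
import numpy as np

# Toy 1-D instance of the regularized iteration in Equation (6):
# f_{k+1} = f_k + beta * (H^T g - (H^T H + lam * C^T C) f_k).
# H (2x down-sampling, no blur) and C (first difference) are
# illustrative stand-ins for the paper's operators.
n = 8
H = np.eye(n)[::2]                    # keep every second sample
C = np.diff(np.eye(n), axis=0)        # first-difference high-pass filter
f_true = np.linspace(0.0, 1.0, n)
g = H @ f_true                        # noise-free LR observation

lam, beta = 0.01, 0.5
A = H.T @ H + lam * C.T @ C           # better-posed system matrix
f = np.zeros(n)
for _ in range(2000):                 # small beta: guaranteed but slow convergence
    f = f + beta * (H.T @ g - A @ f)
```

After convergence, f satisfies the normal equation (5), recovering the observed samples at the kept positions and smoothly interpolating the missing ones.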

Directionally-Adaptive Regularization-Based Super-Resolution with Multiscale Non-Local Means Filter
Since the estimation of the original HR image from the image degradation process given in (2) is almost always an ill-posed problem, there is no unique solution, and a simple inversion process, such as inverse filtering, results in significant amplification of noise and numerical errors [7][8][9][10][11][12]. To solve this problem, regularized image restoration incorporates a priori constraints on the original image to make the inverse problem better posed.
In this section, we describe a modified version of the regularized SR algorithm using a non-local means (NLM) filter [25] and the directionally-adaptive constraint as a regularization term to preserve edge sharpness and to suppress noise amplification. The reconstructed monochromatic HR image is used to generate a color HR image together with given LR multispectral images using IHS image fusion [14,15]. The block-diagram of the proposed method is shown in Figure 2.

Multispectral Low-Resolution Image Degradation Model
Assuming that the original monochromatic image is a linear combination of multispectral images [24], the observed LR multispectral images are generated by low-pass filtering and down-sampling differently translated versions of the original HR monochromatic image. More specifically, the observed LR image in the i-th multispectral band is defined as:

g_i = H f_(x_i, y_i) + η_i,  for i = 1, ..., L,    (7)

where f_(x_i, y_i) represents the version of the original HR monochromatic image f translated by (x_i, y_i), H the image degradation operator, including both low-pass filtering and down-sampling, and η_i the additive noise. In this paper, we assume that there is no warping operation in the image degradation model, because the LR images are acquired by the multispectral sensor for the same scene.
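The observation model in Equation (7) can be simulated as follows. The 3x3 box blur, the integer pixel shifts, and the circular boundary handling are illustrative simplifications chosen for brevity, not the paper's degradation parameters.

```python
import numpy as np

# Simulation of the LR observation model in Equation (7): each band is
# a shifted, low-pass-filtered, and down-sampled copy of the HR image.
# Box blur, integer shifts, and circular boundaries are simplifications.
def degrade(f_hr, dx, dy, factor=2):
    shifted = np.roll(np.roll(f_hr, dy, axis=0), dx, axis=1)  # translation (x_i, y_i)
    blurred = sum(np.roll(np.roll(shifted, i, axis=0), j, axis=1)
                  for i in (-1, 0, 1) for j in (-1, 0, 1)) / 9.0  # 3x3 box blur
    return blurred[::factor, ::factor]                        # down-sampling

f_hr = np.arange(64, dtype=float).reshape(8, 8)
g0 = degrade(f_hr, 0, 0)              # band 1: no shift
g1 = degrade(f_hr, 1, 0)              # band 2: one-pixel horizontal shift
```

The two simulated bands sample differently shifted grids of the same scene, which is exactly the redundancy the multisensor SR framework exploits.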

Multiscale Non-Local Means Filter
If noise is present in the image degradation model given in Equation (2), the HR image estimated using the simple inverse filter yields:

f̂ = H⁻¹g = H⁻¹(Hf + η) = f + H⁻¹η = f + ∆f,    (8)

where the term ∆f = H⁻¹η amplifies the noise in an uncontrollable manner. This process can be considered as solving g = Hf under the observation noise or perturbation η, which results in the amplified error ∆f = H⁻¹η in the solution.
If ∆f is unbounded, the corresponding image restoration of the SR problem is ill posed. To solve the ill-posed restoration problem, we present an improved multiscale non-local means (NLM) filter to suppress the noise before the main restoration step. The noise-removed monochromatic image is estimated using the least-squares optimization:

f̂_m = arg min_{f_m} Σ_s Σ_{n∈Ω_s} w^P_{m,n} (f_m − g_{s,n})²,    (9)

where f_m represents the m-th underlying pixel, Ω_s the local search region in the 1.25^s-times down-scaled image, for s ∈ {−1, −2, −3}, generated using the cubic-spline interpolation kernel [3], g^P_{s,n} the local patches of g_m corresponding to Ω_s, w^P_{m,n} the similarity weight between the local patch g^P_{s,n} in the down-scaled image and the corresponding patch centered at g_m, and the superscript P denotes a patch.
Since the cubic-spline kernel performs low-pass filtering, it decreases the noise variance and facilitates the search for sufficiently similar patches [8]. The similarity weight is computed in the down-scaled image as:

w^P_{m,n} = exp( − ‖g^P_m − g^P_{s,n}‖²_G / h² ),    (10)

where g^P_{s,n} represents the pixels in the patch centered at the location of g_{s,n} in the down-scaled image, g^P_m the pixels in the patch centered at the location of g_m in the original-scale image, and h a decay parameter. The subscript G denotes a Gaussian-weighted norm, in which the Gaussian kernel G controls the exponential decay in the weighting computation.
The solution of the least-squares estimation in Equation (9) is the weight-normalized average:

f̂_m = ( Σ_s Σ_{n∈Ω_s} w^P_{m,n} g_{s,n} ) / ( Σ_s Σ_{n∈Ω_s} w^P_{m,n} ).    (11)
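A single-scale sketch of the NLM estimate in Equations (9)-(11) is given below: the restored pixel is the weight-normalized average of pixels whose surrounding patches resemble the reference patch. The patch size, search range, and scalar decay parameter h are illustrative choices, and the paper's multiscale search over 1.25^s down-scaled images is omitted for brevity.

```python
import numpy as np

# Single-scale NLM sketch of Equations (9)-(11). Patch size, search
# range, and decay h are illustrative; the multiscale search over
# down-scaled images used in the paper is omitted.
def nlm_pixel(img, y, x, h=30.0, half=1, search=3):
    pad = half + search
    p = np.pad(img, pad, mode='reflect')
    yc, xc = y + pad, x + pad
    ref = p[yc - half:yc + half + 1, xc - half:xc + half + 1]
    num = den = 0.0
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = p[yc + dy - half:yc + dy + half + 1,
                     xc + dx - half:xc + dx + half + 1]
            w = np.exp(-np.sum((ref - cand) ** 2) / (h * h))  # patch similarity weight
            num += w * p[yc + dy, xc + dx]
            den += w
    return num / den                  # weight-normalized average, as in Equation (11)

flat = np.full((8, 8), 7.0)
restored = nlm_pixel(flat, 4, 4)      # a constant image is left unchanged
```

On a constant image every weight equals one, so the filter returns the constant unchanged; an isolated outlier is pulled toward the values of its similar neighbors.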

Directionally-Adaptive Constraints
In minimizing the cost function in Equation (3), minimizing only ‖g − Hf‖² results in noise amplification, while minimizing only ‖Cf‖² over-smooths edge regions. In this context, conventional regularized image restoration or SR algorithms [21][22][23] estimate the original image by minimizing a linear combination of the two energies, ‖g − Hf‖² + λ‖Cf‖². In this paper, we incorporate directionally-adaptive smoothness constraints into the regularization process to preserve directional edge sharpness and to suppress noise amplification:

f̂ = arg min_f { ‖g − Hf‖² + λ‖C_D f‖² },  for D = 1, ..., 5,    (12)

where the directionally-adaptive constraints C_D suppress noise amplification along the corresponding edge direction. In this work, the direction D is selected at each pixel using the edge orientation classification filter [23].
The proposed directionally-adaptive constraints are implemented using four different 2D high-pass filters, each oriented along one of the four principal edge directions. By applying the directionally-adaptive constraints, an HR image can be restored from the input LR image while the four directional edges are well preserved. In order to suppress noise amplification in the non-edge (NE) regions, a fifth, non-directional high-pass constraint is used.
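The effect of a directional constraint can be illustrated with standard directional high-pass kernels. The exact filter coefficients used in the paper are not reproduced here, so the second-derivative kernels below (along 0, 45, 90, and 135 degrees, plus an isotropic Laplacian for non-edge regions) are assumed stand-ins, not the paper's filters.

```python
import numpy as np

# Illustrative directional high-pass kernels standing in for the
# constraints C_D; the paper's exact coefficients are not reproduced.
KERNELS = {
    0:    np.array([[0, 0, 0], [-1, 2, -1], [0, 0, 0]], float),   # horizontal variation
    45:   np.array([[0, 0, -1], [0, 2, 0], [-1, 0, 0]], float),
    90:   np.array([[0, -1, 0], [0, 2, 0], [0, -1, 0]], float),   # vertical variation
    135:  np.array([[-1, 0, 0], [0, 2, 0], [0, 0, -1]], float),
    'NE': np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], float), # isotropic Laplacian
}

def constraint_energy(img, kernel):
    """||C_D f||^2 computed by direct valid-region 2-D correlation."""
    kh, kw = kernel.shape
    rows, cols = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return float(np.sum(out ** 2))

edge = np.repeat((np.arange(6) >= 3).astype(float)[None, :], 6, axis=0)  # vertical step
e_along = constraint_energy(edge, KERNELS[90])   # high-pass along the edge: near zero
e_across = constraint_energy(edge, KERNELS[0])   # high-pass across the edge: large
```

A vertical step edge has zero energy under the kernel that differentiates along the edge direction and large energy under the kernel that differentiates across it, which is why selecting C_D by edge orientation avoids penalizing (and thus smoothing) true edges.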

Combined Directionally-Adaptive Regularization and Modified Non-Local Means Filter
Given the multispectral LR images g_i, for i = 1, ..., L, the monochromatic HR image is estimated by the following optimization:

f̂ = arg min_f J(f),    (18)

where the multispectral extended version of Equation (3) is given as:

J(f) = Σ_{i=1}^{L} ‖g_i − H_i f‖² + λ‖C_D f‖².    (19)

The derivative of Equation (19) with respect to f is computed as:

∂J(f)/∂f = −2 Σ_{i=1}^{L} H_i^T (g_i − H_i f) + 2λC_D^T C_D f,    (20)

which becomes zero if:

( Σ_{i=1}^{L} H_i^T H_i + λC_D^T C_D ) f = Σ_{i=1}^{L} H_i^T g_i.    (21)

Finally, Equation (21) can be solved using the well-known iterative optimization combined with the proposed multiscale NLM filter:

f^{k+1} = f^k_NLM + β ( Σ_{i=1}^{L} H_i^T g_i − ( Σ_{i=1}^{L} H_i^T H_i + λC_D^T C_D ) f^k_NLM ),    (22)

where the matrix Σ_{i=1}^{L} H_i^T H_i + λC_D^T C_D is better conditioned, and the step length β should be small enough to guarantee convergence. f^k_NLM represents the multiscale NLM-filtered version of the k-th iterate:

f^k_NLM = NLM(f^k).    (23)

For the implementation of Equation (22), the relation H_i^T g_i = S_i^T H^T g_i implies that the i-th multispectral LR image is first enlarged by simple interpolation using:

H = H̄ ⊗ H̄,    (24)

where ⊗ represents the Kronecker product of matrices and H̄ represents the one-dimensional (1D) low-pass filtering and subsampling process with a specific magnification ratio. The geometric misalignment among different spectral bands is then represented by the pixel-shifting operator S_i.
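The combined update of Equation (22) can be demonstrated on a toy one-dimensional problem: one gradient step on the multi-band data term plus the directional constraint, applied to a filtered iterate that stands in for the multiscale NLM result f_NLM. The two sampling operators, the first-difference stand-in for C_D, and the box-smoothing "NLM" are all illustrative assumptions.

```python
import numpy as np

# Toy 1-D version of Equation (22). The sampling operators, the
# first-difference stand-in for C_D, and the smoothing stand-in for
# the multiscale NLM filter are illustrative assumptions.
n = 8
H1 = np.eye(n)[0::2]                  # band 1: samples even positions
H2 = np.eye(n)[1::2]                  # band 2: samples odd positions (shifted grid)
C = np.diff(np.eye(n), axis=0)        # stand-in for the directional high-pass C_D
f_true = np.sin(np.linspace(0.0, np.pi, n))
g1, g2 = H1 @ f_true, H2 @ f_true     # noise-free multispectral LR observations

lam, beta = 0.01, 0.4
A = H1.T @ H1 + H2.T @ H2 + lam * C.T @ C
b = H1.T @ g1 + H2.T @ g2

def nlm_stand_in(f):                  # mild circular smoothing in place of the NLM filter
    return 0.9 * f + 0.05 * (np.roll(f, 1) + np.roll(f, -1))

f = np.zeros(n)
for _ in range(300):
    f_nlm = nlm_stand_in(f)           # f^k_NLM, as in Equation (23)
    f = f_nlm + beta * (b - A @ f_nlm)
```

Because the two shifted sampling grids together cover every HR position, the iterate converges close to the underlying signal even though each band alone is undersampled.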

Image Fusion-Based HR Color Image Reconstruction
A multispectral imaging sensor measures the radiance of multiple spectral bands whose ranges are divided into a series of contiguous and narrow spectral bands. In this paper, we adopt the aforementioned IHS fusion method to estimate the HR color image from multispectral LR images as [14,15]:

I = (R + G + B) / 3,

where the R, G and B bands are computed as:

R = K ∫_{ω_R1}^{ω_R2} q(ω) R(ω) dω,  G = K ∫_{ω_G1}^{ω_G2} q(ω) R(ω) dω,

and:

B = K ∫_{ω_B1}^{ω_B2} q(ω) R(ω) dω,

where R(ω) represents the spectral radiance, K a constant gain and q(ω) the spectral response function of the multispectral sensor, as defined in Equation (1). The intensity component I is replaced with the estimated monochromatic HR image, and the IHS color space is then converted back to the RGB color space as:

C_H = C + (f̂ − I),  for C ∈ {R, G, B},

where f̂ represents the estimated monochromatic HR image and C_H the fused color HR image.
In this paper, we used the cubic-spline interpolation method to enlarge the hue (H) and saturation (S) channels by the given magnification factor. Figure 3 shows the image fusion-based HR color image reconstruction process.

Experimental Results
In this section, the proposed method is tested on various simulated, multispectral, real UAV and remote sensing images to evaluate the SR performance. In the following experiments, parameters were selected to produce the visually best results. In order to provide comparative experimental results, various existing interpolation and state-of-the-art SR methods were tested, such as cubic-spline interpolation [3], advanced interpolation-based SR [4][5][6], example-based SR [7] and patch-based SR [9][10][11][12]. To compare the performance of several SR methods, we used a set of full-reference image quality metrics, including peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM) [26], multiscale-SSIM (MS-SSIM) [27] and feature similarity index (FSIM) [28]. On the other hand, for the evaluation of the magnified image quality without the reference HR image, we adopted the completely blind image quality assessment methods, including the blind/referenceless image spatial quality evaluator (BRISQUE) [29] and natural image quality evaluator (NIQE) [30]. Higher image quality corresponds to lower BRISQUE and NIQE values and higher PSNR, SSIM, MS-SSIM and FSIM values.
The BRISQUE quantifies the amount of naturalness using locally-normalized luminance values based on a priori knowledge of both natural and artificially-distorted images. The NIQE measures the deviation from the statistical regularities observed in undistorted natural image content using statistical features of natural scenes. Since BRISQUE and NIQE are referenceless metrics, they may not produce the same ranking as the well-known full-reference metrics, such as PSNR and SSIM.
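Of the metrics above, PSNR is simple enough to sketch directly; the peak value of 255 assumes 8-bit images. (SSIM, MS-SSIM, FSIM, BRISQUE, and NIQE involve substantially more machinery and are not reproduced here.)

```python
import numpy as np

# Sketch of the full-reference PSNR metric; peak = 255 assumes 8-bit images.
def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in decibels."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float('inf') if mse == 0.0 else 10.0 * np.log10(peak * peak / mse)

ref = np.zeros((4, 4))
degraded = np.full((4, 4), 16.0)      # uniform error of 16 gray levels
value = psnr(ref, degraded)
```

Identical images give infinite PSNR; a uniform error of 16 gray levels gives 10·log10(255²/256) ≈ 24.05 dB.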

Experiment Using Simulated LR Images
In order to evaluate the qualitative performance of various SR algorithms, we used five multispectral test images, each of which consists of 33 spectral bands in the wavelength range from 400 to 720 nanometers (nm), as shown in Figure 4. In order to compare the objective image quality measures, such as PSNR, SSIM, MS-SSIM, FSIM and NIQE, the original multispectral HR image is first down-sampled by a factor of four to simulate the input LR images. Next, the input LR images are magnified four times using the nine existing methods and the proposed multispectral SR method.
In contrast to the existing methods, the proposed method shows a significantly improved SR result by successfully reconstructing the original high-frequency details and sharpening edges without unnatural artifacts. The PSNR, SSIM, MS-SSIM, FSIM, BRISQUE and NIQE values for the simulated multispectral test images shown in Figure 4 are computed for nine different methods, as summarized in Table 1. Table 1. Comparison of peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), multiscale-SSIM (MS-SSIM), feature similarity index (FSIM), blind/referenceless image spatial quality evaluator (BRISQUE) and natural image quality evaluator (NIQE) values of the resulting images shown in Figure 4 using nine existing methods and the proposed SR method. Based on Table 1, the proposed method gives better results than existing SR methods in the sense of PSNR, SSIM, MS-SSIM and FSIM. Although the proposed method did not always provide the best results in the sense of NIQE and BRISQUE, the averaged performance over the extended set of test images shows that the proposed SR method performs best.
In an additional experiment, an original monochromatic HR image is down-sampled and corrupted by zero-mean white Gaussian noise with standard deviation σ = 10 to obtain a simulated noisy LR image. The simulated LR image is enlarged by three existing SR methods [7,9,12] and the proposed method, as shown in Figure 10. As shown in Figure 10, the existing SR methods can neither remove the noise nor recover the details in the image, whereas the proposed method can successfully reduce the noise and reconstruct the original details. Table 2 shows the PSNR and SSIM values of the three existing SR methods for the same test image shown in Figure 10. The original version of the example-based SR method was not designed for real-time processing, since it requires a patch dictionary before starting the SR process [7]. The performance and processing time of the patch searching process also depend on the size of the dictionary. The sparse representation-based SR method needs iterative optimization for the ℓ1-minimization process [9], which results in an indefinite processing time. Although the proposed SR method also needs iterative optimization for the directionally-adaptive regularization, the regularized optimization can be replaced by an approximated finite-support spatial filter at the cost of the quality of the resulting images [31]. The non-local means filtering is another time-consuming step in the proposed work. However, a finite processing time can be guaranteed by restricting the search range of patches.

Experiment Using Real UAV Images
The proposed method is also tested on real UAV images, as shown in Figure 11. More specifically, the remote sensing image is acquired by QuickBird, equipped with a push broom-type image sensor, to obtain a 0.65-m ground sample distance (GSD) panchromatic image. Figure 11. Four real UAV test images. Figures 12 to 15 show the enhanced versions produced by nine different SR methods and the proposed method. In order to obtain no-reference measures, such as NIQE and BRISQUE values, Figure 11a-d are magnified four times. In addition, the original UAV images are down-sampled four times to generate simulated LR images and compared using the full-reference image quality measures, as summarized in Table 3.
As shown in Figures 12-15, the interpolation-based SR methods cannot successfully recover the details in the image. Since their results are not sufficiently close to the unknown HR image, their NIQE and BRISQUE values are high, whereas the example-based SR methods generate unnatural artifacts near edges because of an inappropriate training dataset. The patch-based methods and the proposed SR method provide better SR results.

Conclusions
In this paper, we presented a multisensor super-resolution (SR) method using directionally-adaptive regularization and multispectral image fusion. The proposed method can overcome the physical limitation of a multispectral image sensor by estimating the color HR image from a set of multispectral LR images. More specifically, the proposed method combines directionally-adaptive regularized image reconstruction with a modified multiscale non-local means (NLM) filter. As a result, the proposed SR method can restore details near edge regions without noise amplification or unnatural SR artifacts. Experimental results show that the proposed method provides better SR results than existing state-of-the-art methods in the sense of objective measures. The proposed method can be applied to all types of images, including gray-scale (single-band), RGB color and multispectral images.