Real-time monitoring of the performance of imaging equipment is important in practical applications such as environmental monitoring and resource exploration [1,2]. In particular, the imaging performance of on-orbit space remote sensors can only be assessed by processing and analyzing the images transmitted from the satellite. Like natural images, remote sensing images suffer from degradations caused by the imaging system (as shown in
Figure 1). The most salient impact of image degradation is a loss of image definition, that is, a decline in visual quality that hampers subsequent remote sensing image interpretation. Remote sensing image quality assessment (RS-IQA) is therefore helpful. Current solutions for evaluating the performance of remote sensors include the Target method [3], the Knife-edge method [4], the Pulse method [5], and others. However, in the Target method, not all on-orbit space remote sensors can obtain target images, and in the Knife-edge and Pulse methods, it is difficult to guarantee that every returned image contains effective knife-edges or pulses. Beyond harsh imaging conditions, these deficiencies make the methods infeasible or hard to generalize to other types of remote sensors. At present, there is no standard approach to assessing the imaging performance of remote sensors. It is therefore urgent to develop a universal method for evaluating the imaging performance of remote sensors via RS-IQA.
1.1. Related Works
Current approaches to assessing the performance of remote sensors typically include those in references [3,4,5]. To the best of our knowledge, there is no IQA method dedicated to RS images; we therefore review the development of IQA methods for natural images. In general, IQA methods fall into two classes: subjective and objective assessment. Because subjective monitoring of image quality by human observers is costly and inefficient, the demand for objective assessment methods in practical applications has grown prominent over the past few decades. Objective IQA tasks can be divided into three categories: full-reference IQA (FR-IQA), reduced-reference IQA (RR-IQA), and no-reference IQA (NR-IQA), of which NR-IQA is the most common and the most challenging. In NR-IQA, the task must be accomplished with degraded images alone, since the pristine images are usually unavailable. Although a great number of IQA algorithms have emerged in the last two decades, all pursuing the common goal of matching computational evaluation to human perception, they cover only a limited range of the application requirements met in practice. Hence, an important gap in NR-IQA remains to be filled.
Early NR-IQA models commonly operate under the hypothesis that images are degraded by one particular kind, or several specified kinds, of distortion [6,7,8,9], which requires a priori knowledge of the distortion types. Limited by this selection of distortion types, such prior-dependent algorithms cannot progress further. Later, a new NR-IQA class emerged that demands no prior knowledge of distortion types, labeled blind image quality assessment (BIQA). The main idea of BIQA is to train a model on a database of distorted images associated with subjective assessment scores. A typical model is the Blind Image Quality Index (BIQI) [
10]. Given a test image, BIQI first extracts natural scene statistics (NSS) using a pre-trained distortion-type estimation model. These statistics then determine which distortion type(s) the image has suffered, and the final quality score is computed from the extracted statistics and the estimated distortion types. The BIQI model was later extended to the DIIVINE model [11]; the improvement lies in its adoption of a richer set of natural scene statistics. Besides BIQI and DIIVINE, Saad et al. successively proposed two models, BLINDS [12] and BLINDS-II [13]. Both can be summarized as learning a probabilistic model from an NSS-based feature set; they differ in the computational complexity of feature extraction. Moreover, Mittal et al. proposed BRISQUE [14], which applies locally normalized luminance coefficients to estimate the loss of naturalness in a degraded image and derives the final quality score from that loss measurement. Ye et al. [15] proposed an unsupervised feature learning framework for BIQA named CORNIA, which operates in a feature-encoding manner and combines feature learning with regression training. CORNIA was later refined into the semantic obviousness metric (SOM) [16], which mainly detects and processes object-like regions. In [17], another BIQA model, DESIQUE, was presented, which adopts features in both the spatial and frequency domains.
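The locally normalized luminance coefficients used by BRISQUE (the mean-subtracted contrast-normalized, or MSCN, field) can be sketched as follows. This is a minimal illustration: the Gaussian window width and the stabilizing constant are illustrative choices, not the exact settings of the original paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn_coefficients(image, sigma=7.0 / 6.0, c=1e-3):
    """MSCN coefficients: luminance normalized by its local mean and deviation."""
    image = image.astype(np.float64)
    mu = gaussian_filter(image, sigma)                  # local mean field
    var = gaussian_filter(image ** 2, sigma) - mu ** 2  # local variance field
    std = np.sqrt(np.abs(var))                          # local deviation (abs guards round-off)
    return (image - mu) / (std + c)                     # normalized luminance field
```

For pristine natural images the MSCN coefficients follow a near-Gaussian distribution; distortions perturb this regularity, which is what the NSS features measure.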
However, all these approaches share the common problem of weak generalization ability [18]. Specifically, these models must be trained on particular distorted-image database(s) to learn a regression model, and when applied to other databases they perform rather poorly [18]. Moreover, since real-world distortion types are numerous and changeable, and an image can suffer from one or several distortions, no BIQA algorithm can be trained on a database that perfectly contains all distortion types. In other words, complete prior knowledge of image distortion types can never be acquired, which inevitably results in the poor generalization ability of such algorithms. It is therefore of great significance to develop more general and more practical BIQA methods.
The Natural Image Quality Evaluator (NIQE) model [19] possesses better generalization ability. NIQE is first trained on a corpus of high-quality images to learn a reference multivariate Gaussian (MVG) model. Given a test image, it then extracts an NSS-based feature set and fits the feature vectors to an MVG model. Finally, the overall quality of the test image is predicted by measuring the distance between its MVG model and the reference model. However, this method may lose useful information: since only one MVG model characterizes the whole test image, some local information is neglected. To tackle this problem, Zhang et al. proposed the IL-NIQE model [18], which partitions a test image into patches and extracts an enriched feature set from each patch; a set of MVG models is thus obtained, and the final quality score is computed by average pooling. Bosse et al. [20] proposed a purely data-driven, end-to-end deep neural network for NR-IQA and FR-IQA that also exploits local information, extracting deep features with 10 convolutional layers and five pooling layers and performing regression with two fully connected layers. Recently, the MUSIQUE method [21] was proposed; unlike previous methods that were only applicable to singly distorted images, MUSIQUE handles both single and multiple distortions by estimating three distortion parameters from NSS-based features and mapping them to an overall quality score.
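The distance NIQE measures between the MVG model fitted to the test image, with mean and covariance (&nu;&#8321;, &Sigma;&#8321;), and the pristine reference model (&nu;&#8322;, &Sigma;&#8322;) is D = sqrt((&nu;&#8321; &minus; &nu;&#8322;)&#7488; ((&Sigma;&#8321; + &Sigma;&#8322;)/2)&#8315;&#185; (&nu;&#8321; &minus; &nu;&#8322;)). A minimal sketch of this scoring step (the pseudo-inverse is an implementation choice to guard against a singular pooled covariance):

```python
import numpy as np

def niqe_distance(mu_test, cov_test, mu_ref, cov_ref):
    """NIQE-style distance between a test-image MVG model and the reference MVG."""
    diff = mu_test - mu_ref
    pooled = (cov_test + cov_ref) / 2.0           # average the two covariances
    return float(np.sqrt(diff @ np.linalg.pinv(pooled) @ diff))
```

A larger distance means the test image's statistics deviate more from the pristine model, i.e., lower predicted quality.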
However, we observed that these methods obtain unsatisfying results when applied to images of remote sensing scenes (refer to
Section 5 for more information), revealing weak generalization across tasks. In this work, we therefore seek a method that efficaciously evaluates image quality for both real-life natural scenes and RS scenes.
Note that in this paper, "natural image" denotes an image of a real-life scene and "remote sensing image" an image of an RS scene; the terms differentiate image content, while both kinds of image share the RGB format.
1.2. Our Contributions
In this paper, we propose a general-purpose BIQA model for real-life scenes as well as remote sensing scenes, named BM-IQE. Inspired by [18] and building on IL-NIQE, we introduce an enriched feature bag (EFB) and a structural similarity block-matching (SSIM-BM) strategy to ensure that the proposed method performs well in RS-IQA applications. Meanwhile, the proposed method achieves competitive performance on natural images compared with existing state-of-the-art BIQA methods. A general framework of the proposed BM-IQE model is shown in
Figure 2.
The contributions of our BM-IQE model are as follows:
We are the first to construct datasets for BIQA of RS scene images, built from public scene datasets and simulated degraded images.
We are the first to study imaging performance evaluation of remote sensors by means of image quality assessment, presenting a new way of indirectly evaluating the imaging performance of remote sensors.
We introduce a block-matching strategy to assist the image patch strategy. This operation better expresses the intrinsic features of image patches (such as affinity and correlation) and makes the quality prediction less susceptible to image degradations, so the assessment acquires higher efficacy and accuracy.
We adopt four classic gray-level co-occurrence matrix (GLCM) statistics as texture features so that our method applies to remote sensing scenes more appropriately than existing IQA models, thereby enhancing the universality and practicality of the proposed model.
We conducted an extensive series of experiments on various large-scale public databases, including RS scene image datasets, singly distorted image databases, and multiply distorted image databases. Note that in many remote sensing applications, such as scene classification and target detection, images with the visible red, green, and blue bands fully present the color, texture, and contour features of the land cover from the perspective of human visual perception, so RGB images are usually used for content understanding. Moreover, the algorithm proposed in this paper is a general image quality assessment approach oriented to a wide range of scene types, including both RS scenes and common real-life scenes, for which RGB color is a fundamental low-level feature for recognizing objects and understanding content. Thus, only images with RGB channels are used in our experiments, and the proposed algorithm is accordingly designed for RGB images of RS scenes and real-life scenes. Experimental results show that the proposed BM-IQE method outperforms other state-of-the-art IQA models on RS scenes and is highly efficacious and robust on real-life scenes.
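As an illustration of the GLCM texture statistics mentioned in the contributions above, the four classic measures can be computed for a single horizontal pixel offset as in the following numpy sketch. The quantization to a small number of gray levels and the single offset are illustrative simplifications, not the settings used in BM-IQE.

```python
import numpy as np

def glcm_features(gray, levels=8):
    """Contrast, correlation, energy, and homogeneity from a symmetric,
    normalized GLCM built over horizontal pixel pairs of an 8-bit image."""
    q = (gray.astype(np.int64) * levels) // 256              # quantize to `levels` bins
    pairs_i, pairs_j = q[:, :-1].ravel(), q[:, 1:].ravel()   # horizontal neighbor pairs
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (pairs_i, pairs_j), 1)                   # co-occurrence counts
    p = (glcm + glcm.T) / (2.0 * glcm.sum())                 # symmetric, normalized
    i, j = np.indices((levels, levels))
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    var_i = ((i - mu_i) ** 2 * p).sum()
    var_j = ((j - mu_j) ** 2 * p).sum()
    corr = 1.0 if var_i == 0 or var_j == 0 else \
        ((i - mu_i) * (j - mu_j) * p).sum() / np.sqrt(var_i * var_j)
    return {
        "contrast": ((i - j) ** 2 * p).sum(),        # weights distant gray-level pairs
        "correlation": float(corr),                  # linear dependency of neighbors
        "energy": (p ** 2).sum(),                    # angular second moment
        "homogeneity": (p / (1.0 + np.abs(i - j))).sum(),  # closeness to the diagonal
    }
```

A uniform patch concentrates all co-occurrence mass on the diagonal, giving zero contrast and maximal energy and homogeneity; textured RS land cover spreads the mass and shifts these statistics accordingly.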
The rest of this paper is organized as follows.
Section 2 introduces the block-matching strategy used in BM-IQE.
Section 3 introduces the features we adopt to predict image quality.
Section 4 illustrates how the proposed new model is designed.
Section 5 presents the experimental results and
Section 6 concludes the paper.