No-Reference Blurred Image Quality Assessment by Structural Similarity Index

No-reference (NR) image quality assessment (IQA) objectively measures image quality, consistently with subjective evaluations, by using only the distorted image. In this paper, we focus on the problem of NR IQA for blurred images and propose a new no-reference structural similarity (NSSIM) metric based on re-blur theory and the structural similarity index (SSIM). We extract blurriness features and define image blurriness by grayscale distribution. NSSIM scores image quality by calculating image luminance, contrast, structure and blurriness. The proposed NSSIM metric can evaluate image quality immediately without prior training or learning. Experimental results on four popular datasets show that the proposed metric outperforms SSIM and is well matched to state-of-the-art NR IQA models. Furthermore, we apply NSSIM together with known IQA approaches to blurred image restoration and demonstrate that NSSIM is statistically superior to peak signal-to-noise ratio (PSNR) and SSIM, and consistent with the state-of-the-art NR IQA models.


Introduction
Advances in digital techniques enable people to capture, store and send a large amount of digital images easily, which sharply accelerates the rate of information transfer. Images captured by cameras in the real world are usually subject to distortions during acquisition, compression, transmission, processing, and reproduction. The information that an image conveys is related to image quality, thus image quality is very significant for image acquisition and processing systems. If the quality of an image is bad, its processing result is usually bad as well. Therefore, image acquisition and processing systems in real applications need image quality assessment (IQA) to objectively and automatically identify and quantify these image quality degradations. In recent decades, many IQA methods have been proposed to solve this problem. IQA methods can be classified as full-reference (FR), no-reference (NR) and reduced-reference (RR) methods. FR methods, such as the mean squared error (MSE), the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) [1], require the original undistorted images as the references. In addition, RR methods need prior information about the original images, but either the original undistorted images or their prior information are rarely obtained in practice. NR methods can assess the quality of the distorted images only using themselves, thus are more suitable for actual applications. In this paper, we confine our work to NR methods and focus on the degradation of blur.
The popular FR metric SSIM [1] takes advantage of mathematical convenience and matching to known characteristics of the human visual system (HVS) but suffers from a lack of pristine reference images. Recent general NR models, such as [2][3][4], perform well when adapting with HVS, but lose computation convenience when training with distorted images in advance. In this paper, we propose an NR IQA method for blurred images. Our proposed metric, called no-reference structural similarity (NSSIM), is based on re-blur theory and SSIM, and can be regarded as an improvement of SSIM from FR to NR. The first part is the re-blur process, i.e., blurring the distorted images by a Gaussian low-pass filter. The second is to quantify the blurriness of the image by grayscale distribution. Finally we combine luminance, contrast, structure and blurriness to a quality score. Our NSSIM can be easily computed using the input image and its re-blurred image, and no previous training is needed. Experimental results from four popular datasets validate the better performance of NSSIM than the compared FR and NR methods. In addition, we apply NSSIM to evaluate the performance of the blurred image restoration. The results demonstrate that NSSIM performs as well as the state-of-the-art NR IQA metrics and can get better image quality evaluation than the existing FR IQA metrics PSNR and SSIM. The contribution of the paper is two-fold. One is that we propose a novel definition of image blurriness based on the grayscale distribution of the image, and verify its effectiveness of blurriness measurement and fitness with the subjective quality sense of a human. The other is that we extend the famous FR IQA metric SSIM to a no-reference manner, achieving state-of-the-art IQA performance without previous training.
The rest of this paper is organized as follows. In Section 2, we review previous works in IQA. In Section 3, we describe the definition and computation of blurriness and the procedure of our model. In Section 4, we evaluate the performance of the proposed approach by comparing it with the state of the art and applying it to blurred image restoration. Section 5 concludes the paper.

Related Works
FR IQA methods are usually used to obtain a quantitative evaluation of image quality. For example, PSNR is a classical FR IQA metric computed from the mean squared error between the distorted image and its reference, relative to the peak grayscale value; it is simple to compute but cannot accurately simulate the HVS. Another popular FR metric, SSIM [1], takes advantage of mathematical convenience and matches known characteristics of the HVS, but requires pristine reference images that are often unavailable. To avoid this requirement, NR IQA algorithms have been studied for applications in which the undistorted reference images are unavailable. According to their capability, NR IQA algorithms can be divided into distortion-specific and holistic models. In the following, we survey NR IQA algorithms that target blur and compression, as well as several holistic models.
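For reference, the conventional PSNR computation can be sketched as follows. This is a minimal illustration and not code from the paper: the function name `psnr` and the flat-list image representation are assumptions made here, and L = 255 assumes 8-bit grayscale.

```python
import math

def psnr(x, y, L=255):
    """PSNR in dB between a distorted image x and its reference y.

    x and y are equally sized flat lists of pixel intensities
    (a hypothetical representation chosen for this sketch).
    """
    mse = sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10 * math.log10(L * L / mse)
```

Because the score depends on the pixel-wise error against a reference, PSNR cannot be computed when only the distorted image is available.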

Distortion-Specific NR IQA Algorithms
Distortion-specific NR IQA algorithms assume that the distortion type is known. Popular blur IQA algorithms model edge spreads and relate these spreads to perceived quality. Sang [5] proposes a blur IQA model that uses singular value decomposition (SVD) to evaluate image structural similarity. Caviedes [6] computes sharpness using the average 2D kurtosis of the 8 × 8 DCT blocks and spatial edge extent information. Ferzli [2] evaluates image quality by just noticeable blur (JNB). Joshi [7] presents an NR IQA method based on the continuous wavelet transform. Similarly, the general approach to NR JPEG IQA is to measure edge strength at block boundaries and relate this strength, possibly together with some measure of image activity, to perceived quality. Feng [8] measures the visual impact of ringing artifacts for JPEG images. Meesters [9] detects the low-amplitude edges that result from blocking and estimates the edge amplitudes. Wang [10] evaluates image quality by designing a computationally inexpensive and memory-efficient feature extraction method and estimating the activity of the image signal. JPEG2000 ringing artifacts in an image are normally modeled by measuring edge spread using an edge-detection-based approach. For example, Sazzad [11] computes simple features in the spatial domain, Sheikh [12] assesses image quality by natural scene statistics (NSS) models, and Marziliano [13] calculates edge width by finding the start and end positions of each corresponding edge in the processed image.

Holistic NR IQA Algorithms
Holistic IQA algorithms are designed to measure distortions of unknown type. Holistic models extract common features of various distortions or establish different models for different distortions. BIQI [14] subjects an image to a wavelet transform over three scales and three orientations using the Daubechies 9/7 wavelet basis [15], and assesses image quality with a two-step framework that estimates the presence of a set of distortions and then evaluates the quality of the image along each of these distortions. BLIIND-II [3] is a multiscale but single-stage machine learning algorithm that operates in the DCT domain, where a number of features, i.e., scale and orientation selective statistics, correlations across scales, spatial correlation and across-orientation statistics, are computed from a natural scene statistics (NSS) model of block DCT coefficients. BRISQUE [4] models the statistics of locally normalized luminance coefficients in the spatial domain. Considering that images are naturally multiscale, BRISQUE captures 36 features from two scales to identify image distortions and activate distortion-specific quality assessment. A support vector regressor (SVR) [16] is used to build a regression module that predicts the quality score. NIQE [17] is founded on perceptually relevant spatial domain NSS features extracted from local image patches that capture the essential low-order statistics of natural images. NIQE takes the distance between the quality-aware NSS feature model and the multivariate Gaussian (MVG) fit to the features extracted from the distorted image as the quality score.
The up-to-date NR IQA algorithm NR-CSR [18] applies convolutional sparse representation (CSR) to model the entire image as a sum over a set of convolutions of coefficient maps, each of which has the same size as the image. NR-CSR uses a low-pass filter to obtain the sparse coefficients and calculates the gradient value to score the sharpness. Meanwhile, artificial neural networks have already been used by novel NR IQA algorithms. For example, Fan [19] proposed an NR IQA algorithm based on multi-expert convolutional neural networks (MCNN) and Li [20] proposed an IQA model for realistic blur images based on semantic feature aggregation (SFA).
It should be noted that state-of-the-art NR IQA algorithms like BRISQUE and BLIIND-II agree well with the HVS but lose computational convenience because they must be trained on distorted images in advance. Moreover, deep learning methods like MCNN suffer from heavy time cost and high-level hardware requirements.

The Proposed IQA Metric
Our proposed NR IQA metric for blurred images aims to be applied without advance training using distorted/undistorted images. It can be regarded as an improvement of SSIM from FR to NR by using re-blur theory, thus we call it NSSIM. In this section, we describe the detail of NSSIM in four parts. Firstly, we revisit the FR structural similarity index (SSIM), and then introduce re-blur, which gives the twice-blurred image. The following part describes feature extraction, where we define image blurriness d from image grayscale histogram distribution. Finally, we compare luminance, contrast, structure and blurriness between the original image and its re-blurred image to land a quality score, i.e., our NSSIM.

Structural Similarity
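The FR metric SSIM [1] compares the luminance, contrast and structure of a distorted image x and a pristine reference image y, i.e.,

$$\mathrm{SSIM}(x, y) = [l(x, y)]^{\alpha} \, [c(x, y)]^{\beta} \, [s(x, y)]^{\gamma},$$

where l(x, y), c(x, y) and s(x, y) respectively represent the luminance, contrast and structure comparison functions, and α > 0, β > 0 and γ > 0 are parameters that adjust the relative weight of the three components.
Weber's Law [21] indicates that the magnitude of a just-noticeable luminance change ΔI is approximately proportional to the background luminance I for a wide range of luminance values. Thus the luminance comparison function is defined as

$$l(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1},$$

where $\mu_x = \frac{1}{N}\sum_{i=1}^{N} x_i$ and $\mu_y = \frac{1}{N}\sum_{i=1}^{N} y_i$ (N is the total number of image pixels, and $x_i$ and $y_i$ are single pixels in x and y) represent the mean intensity of image x and image y, respectively. $C_1$ is a positive constant to avoid instability when $\mu_x^2 + \mu_y^2$ is very close to zero: $C_1 = (K_1 L)^2$, where $K_1 \ll 1$ is a small constant and L is the dynamic range of the pixel values (e.g., 255 for 8-bit grayscale images).
Similarly, the contrast comparison function is

$$c(x, y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2},$$

where $\sigma_x = \left(\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \mu_x)^2\right)^{1/2}$ and $\sigma_y = \left(\frac{1}{N-1}\sum_{i=1}^{N}(y_i - \mu_y)^2\right)^{1/2}$ are the standard deviations of x and y, respectively, and $C_2 = (K_2 L)^2$ with $K_2 \ll 1$, as in [1]. The definition of the structure comparison function is

$$s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3},$$

where $C_3$ is a small positive constant to avoid instability when $\sigma_x \sigma_y$ is close to zero, and $\sigma_{xy} = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \mu_x)(y_i - \mu_y)$ is the covariance of x and y.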
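The three SSIM comparison functions can be sketched on a single window as follows. This is a minimal illustration under stated assumptions rather than the reference implementation of [1]: the function name `ssim_global`, the flat-list image representation, the defaults K1 = 0.01 and K2 = 0.03, and the choice C3 = C2/2 are assumptions made here, and the statistics are computed over the whole input instead of over local windows.

```python
import math

def ssim_global(x, y, L=255, K1=0.01, K2=0.03, alpha=1.0, beta=1.0, gamma=1.0):
    """SSIM computed once over two equally sized flat lists of pixel intensities."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Unbiased variance and covariance estimates (divide by n - 1).
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)

    C1 = (K1 * L) ** 2
    C2 = (K2 * L) ** 2
    C3 = C2 / 2  # assumption: a common simplifying choice

    lum = (2 * mu_x * mu_y + C1) / (mu_x ** 2 + mu_y ** 2 + C1)
    con = (2 * sd_x * sd_y + C2) / (var_x + var_y + C2)
    struct = (cov_xy + C3) / (sd_x * sd_y + C3)
    # struct can be negative, so non-integer gamma is only safe when struct > 0.
    return (lum ** alpha) * (con ** beta) * (struct ** gamma)
```

With α = β = γ = 1 the score equals 1 for identical inputs and drops below 1 as the mean, the contrast or the correlation of the two inputs diverge.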

Re-Blur
It is known that sharp images contain more high-frequency components, thus grayscale variations between adjacent pixels in sharp images are more distinct than those in blurred images. The re-blur theory [17] explains that the variation in quality of sharp images is larger than that of blurred images after blur processing, which is also demonstrated in Figure 1. Considering the input image x, we can take the re-blurred image y as the reference, and the image quality can be assessed by the quantity of high-frequency components measured between x and y. The re-blur procedure is shown in Figure 2. Considering Gaussian blur as the distortion type in this paper, we apply a Gaussian kernel $k_g$ (i.e., a Gaussian low-pass filter) to the distorted image x to obtain a re-blurred image y, which is formulated as

$$y = x * k_g,$$

where * is the convolution operator, and the Gaussian kernel $k_g$ is sampled from a two-dimensional Gaussian function

$$k_g(u, v) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{u^2 + v^2}{2\sigma^2}\right).$$

It should be noted that $k_g$ is parameterized by the kernel size and the standard deviation σ. The impact of these parameters will be discussed in Section 4.
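The re-blur step can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' code: the function names, the nested-list image representation, and the normalization of the sampled kernel to unit sum are assumptions made here. The defaults (11 × 11, σ = 1.5) are the values reported later in the parameter discussion.

```python
import math

def gaussian_kernel(size=11, sigma=1.5):
    """Sample a size x size Gaussian kernel and normalize it to sum to 1."""
    half = size // 2
    k = [[math.exp(-(u * u + v * v) / (2 * sigma * sigma))
          for v in range(-half, half + 1)]
         for u in range(-half, half + 1)]
    total = sum(map(sum, k))
    return [[w / total for w in row] for row in k]

def reflect(i, n):
    """Mirror-reflect index i into [0, n); the border pixel is repeated."""
    period = 2 * n
    i %= period
    return i if i < n else period - 1 - i

def reblur(img, kernel):
    """Filter a 2D image (list of rows), keeping the original size.

    The kernel is symmetric, so correlation and convolution coincide.
    """
    rows, cols = len(img), len(img[0])
    half = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for du, krow in enumerate(kernel):
                rr = reflect(r + du - half, rows)
                for dv, w in enumerate(krow):
                    acc += w * img[rr][reflect(c + dv - half, cols)]
            out[r][c] = acc
    return out
```

A flat image is left unchanged by this filter, while a sharp edge is spread over neighbouring pixels, which is the behaviour the re-blur comparison relies on.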
We decompose an M × N × 3 image into the high-frequency part $I_H$ and the low-frequency part $I_L$. $I_H$ represents the regions where the grayscale changes drastically, while $I_L$ represents the regions where it changes mildly. In order to focus on the main part that contributes to image quality and to reduce the time cost, the input multi-dimensional image is down-sampled with a simple low-pass filter, using the factor f = max(1, round(min(M, N)/256)). If f > 1, the low-pass filter is applied before down-sampling. Values outside the bounds of the image are computed by mirror-reflecting the image across the border. The filtered image has the same size as the original image because the filter output is restricted to the points of the original image. In addition, we convert both the multi-dimensional original image and the filtered, i.e., re-blurred, image into two-dimensional form.
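The down-sampling rule can be sketched as below. The factor follows the formula in the text; the f × f averaging (box) filter is an assumption made here because the text does not preserve the filter definition, and the function names are hypothetical.

```python
def downsample_factor(M, N):
    """f = max(1, round(min(M, N) / 256)), rounding halves up.

    int(v + 0.5) is used because Python's built-in round() rounds halves to even.
    """
    return max(1, int(min(M, N) / 256 + 0.5))

def reflect(i, n):
    """Mirror-reflect index i into [0, n); the border pixel is repeated."""
    period = 2 * n
    i %= period
    return i if i < n else period - 1 - i

def box_downsample(img, f):
    """Average over an f x f window (assumed filter), then keep every f-th pixel."""
    if f <= 1:
        return [row[:] for row in img]
    rows, cols = len(img), len(img[0])
    half = f // 2
    out = []
    for r in range(0, rows, f):
        out_row = []
        for c in range(0, cols, f):
            window = [img[reflect(r + dr - half, rows)][reflect(c + dc - half, cols)]
                      for dr in range(f) for dc in range(f)]
            out_row.append(sum(window) / (f * f))
        out.append(out_row)
    return out
```

For a 768 × 512 image the factor is 2, so the quality computation runs on roughly a quarter of the pixels.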

Feature Extraction
Considering the abundant information that images possess, the complexity of images can be represented based on their structure, noise and diversity [22], on fuzzy measures of entropy [23], or on discrete wavelet transform decomposition [24], etc. In this paper, to represent the input image and the re-blurred image, we extract the luminance, contrast, structure and blurriness of both down-sampled 2D images. Then, we improve the traditional structural similarity by combining the blurriness with luminance, contrast and structure, as shown in Figure 3. We emphasize image blurriness in this subsection.
The image histogram reports the grayscale distribution. As seen in Figure 4, sharp images have a broader grayscale range. In contrast, the grayscale distributions of blurred images are narrower and tend to approach the mean value according to their histograms. As shown in Figure 4, grayscales close to the mean value µ of the blurred image (e.g., pixels whose grayscale is close to 132.92 in the right histogram in Figure 4) take the largest proportion in the histogram. Thus we describe image blurriness by assigning different weights to different pixel values. We assign heavy weights to the pixels close to the image mean grayscale value and lighter weights to those far from it. Thus we define the image blurriness as

$$d = \sum_{g_i = 0}^{L} w(g_i) \, p(g_i),$$

where d represents image blurriness, $g_i$ is a gray value ranging from 0 to the dynamic range L (e.g., L = 255 for an 8-bit image), $p(g_i)$ is the proportion of $g_i$ in the whole image, and $w(g_i)$ represents the weight of $g_i$, which is heavier the closer $g_i$ is to the mean µ.
Using the proposed image blurriness index, we test the images in Figure 1 to verify the impact of the re-blur process. Using SSIM and image blurriness as image quality indexes, the results are shown in Table 1. The quality of a sharp image declines more drastically than that of a blurred image. Then we analyze the image quality trend caused by re-blur, i.e., we blur the sharp image of Figure 1 with the number of blur passes varying from 0 to 11. Experimental results are shown in Figure 5. The proposed blurriness index grows approximately linearly, while the SSIM index declines more and more slowly as the number of blur passes increases. This demonstrates that a high-quality image has a relatively small SSIM index when its blurred image is taken as the reference.
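A histogram-weighted blurriness score of the kind described above can be sketched as follows. The weight function used here, w(g) = 1 − |g − µ|/L, is an assumption chosen only because it is largest at the mean and decreases away from it; it is not the weight definition of the paper, which is not preserved in this text. The function name and the flat-list input are also hypothetical.

```python
def blurriness(pixels, L=255):
    """Weighted sum of gray-level proportions for integer gray values in [0, L]."""
    n = len(pixels)
    mu = sum(pixels) / n
    hist = [0] * (L + 1)
    for g in pixels:
        hist[g] += 1
    d = 0.0
    for g, count in enumerate(hist):
        if count == 0:
            continue
        p = count / n                 # proportion of gray value g in the image
        w = 1.0 - abs(g - mu) / L     # ASSUMED weight: 1 at the mean, linear decay
        d += w * p
    return d
```

Under this assumed weight, an image whose gray values cluster near the mean scores close to 1, while a high-contrast image with values spread toward 0 and L scores lower, matching the qualitative behaviour described for Figure 4.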
In order to verify the consistency between the proposed image blurriness metric and natural blurred images, we selected five Gaussian blur images from each of the four datasets, CSIQ [25], LIVE II [26], IVC [27] and TID2013 [28], shown in Figure 6. Table 2 indicates that the proposed image blurriness metric has positive correlation with the difference mean opinion scores (DMOS), i.e., the subjective quality scores, in the CSIQ and LIVE II datasets, and negative correlation with the DMOS of the IVC and TID2013 datasets. It can be concluded that our blurriness metric can precisely represent the blurriness of natural images and fits the subjective quality assessment of the HVS.
Table 2. Blurriness of four sets of images shown in Figure 6, demonstrating the consistency between the proposed image blurriness metric and DMOS, i.e., the subjective scores given by the datasets.

NSSIM Index
Similar to the definitions of luminance, contrast and structure in SSIM, we define the blurriness comparison function as

$$h(x, y) = \frac{2 d_x d_y + C_4}{d_x^2 + d_y^2 + C_4},$$

where $d_x$ and $d_y$ respectively represent the blurriness of the distorted image x and its re-blurred image y, and $C_4$ avoids instability when $d_x^2 + d_y^2$ is close to zero. Thus, we can combine luminance, contrast, structure and blurriness into a new metric $\mathrm{SSIM}_r$ as

$$\mathrm{SSIM}_r(x, y) = [l(x, y)]^{\alpha} \, [c(x, y)]^{\beta} \, [s(x, y)]^{\gamma} \, [h(x, y)]^{\lambda},$$

where λ is the exponent coefficient of h(x, y). To better capture the blurriness in local areas of an image, we partition an image into P × P patches of the same size and compute the mean $\mathrm{SSIM}_r$ as

$$\mathrm{MSSIM}_r(x, y) = \frac{1}{M} \sum_{i=1}^{M} \mathrm{SSIM}_r(x_i, y_i),$$

where M = P × P, and $x_i$ and $y_i$ are the i-th patches in x and y, respectively. Finally, to make the quality score accord with the subjective impression, i.e., a high-quality image gets a high IQA score, we define our proposed NSSIM index as

$$\mathrm{NSSIM}(x, y) = 1 - \mathrm{MSSIM}_r(x, y).$$
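The patch-wise combination described in this subsection can be sketched end to end as follows. This is a sketch under several assumptions, not the authors' implementation: the blurriness weight 1 − |g − µ|/L, the constants K1, K2, C3 = C2/2 and C4, and the final inversion 1 − mean are assumptions made here, and α = β = γ = 1 is fixed for brevity. The re-blurred image y is taken as an input.

```python
import math

def _blurriness(p, L):
    # ASSUMED weight 1 - |g - mu| / L, averaged over pixels
    # (equivalent to weighting the gray-level histogram proportions).
    mu = sum(p) / len(p)
    return sum(1.0 - abs(g - mu) / L for g in p) / len(p)

def ssim_r(x, y, L=255, K1=0.01, K2=0.03, C4=1e-4, lam=1.0):
    """Luminance * contrast * structure * blurriness**lam for two flat patches."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    C3 = C2 / 2  # assumption
    sd_prod = math.sqrt(vx * vy)
    lum = (2 * mx * my + C1) / (mx * mx + my * my + C1)
    con = (2 * sd_prod + C2) / (vx + vy + C2)
    struct = (cxy + C3) / (sd_prod + C3)
    dx, dy = _blurriness(x, L), _blurriness(y, L)
    h = (2 * dx * dy + C4) / (dx * dx + dy * dy + C4)
    return lum * con * struct * h ** lam

def patches(img, P):
    """Yield P x P equally sized patches of a 2D image as flat lists."""
    ph, pw = len(img) // P, len(img[0]) // P
    for i in range(P):
        for j in range(P):
            yield [img[r][c]
                   for r in range(i * ph, (i + 1) * ph)
                   for c in range(j * pw, (j + 1) * pw)]

def nssim(x, y, P=16, **kw):
    """x: input image, y: its re-blurred version (2D lists of equal size)."""
    scores = [ssim_r(px, py, **kw) for px, py in zip(patches(x, P), patches(y, P))]
    return 1.0 - sum(scores) / len(scores)  # ASSUMED inversion: sharper -> higher
```

If re-blurring barely changes the image (an already blurred input), every patch score is close to 1 and the result is close to 0; a sharp input changes more under re-blur and therefore scores higher.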

Datasets
We tested the proposed NSSIM on four popular datasets for IQA, including CSIQ [25], LIVE II [26], IVC [27] and TID2013 [28]. All datasets consist of several subsets of different distortion types. In this paper, we used Gaussian blur distortion for experiments. In particular, the CSIQ dataset contains 30 reference images and 150 Gaussian blur images, the LIVE II dataset contains 29 reference images and 145 Gaussian blur images, the IVC dataset contains four reference images and 20 Gaussian blur images, and the TID2013 dataset contains 25 reference images and 125 Gaussian blur images. All blur images in these datasets were used together with their DMOS as subjective quality scores. Some samples of Gaussian blur images for experiments are shown in Figure 6 together with their DMOS. It should be noted that DMOS scores provided by CSIQ [25] and LIVE II [26] have positive correlation with blurriness scores, while DMOS scores provided by IVC [27] and TID2013 [28] change in the opposite direction. As shown in Figure 6, the datasets used for performance evaluation contain blur images with different kinds of nature scenes which are generally distributed, thus are suitable for statistical analysis of evaluation results.

Indexes for Evaluation
In order to provide a quantitative measurement of the performance of our proposed model, we follow the performance evaluation procedures employed by the video quality experts group (VQEG) [29]. We test the proposed IQA metric using Spearman's rank order correlation coefficient (SROCC), Pearson linear correlation coefficient (PLCC) and root mean square error (RMSE) as evaluation indexes. SROCC, PLCC and RMSE are respectively defined as

$$\mathrm{SROCC} = 1 - \frac{6 \sum_{i=1}^{N} (R_{s_i} - R_{p_i})^2}{N (N^2 - 1)},$$

$$\mathrm{PLCC} = \frac{\sum_{i=1}^{N} (s_i - \bar{s})(p_i - \bar{p})}{\sqrt{\sum_{i=1}^{N} (s_i - \bar{s})^2 \sum_{i=1}^{N} (p_i - \bar{p})^2}},$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (s_i - p_i)^2},$$

where N is the number of images, $s_i$ and $p_i$ represent the i-th scores given by subjective evaluation and objective evaluation, $\bar{s}$ and $\bar{p}$ represent the mean subjective quality score and the mean objective predicted score, and $R_{s_i}$ and $R_{p_i}$ represent the rank order numbers of $s_i$ and $p_i$, respectively. SROCC is a nonparametric measure of rank correlation that statistically assesses how well the relationship between the subjective quality score and the objective predicted score can be described by a monotonic function, while PLCC is a measure of the linear correlation between them. RMSE is the square root of the mean of the squared differences between the subjective quality scores and the objective predicted scores, and is a frequently used measure. By using these three statistical measures, we can easily analyze the consistency between the subjective quality score and the objective predicted score, which indicates the capability of IQA methods.
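The three evaluation indexes can be computed with the standard library alone, as in the following sketch. The function names are chosen here for illustration; the SROCC shortcut formula based on squared rank differences is exact only when there are no tied scores.

```python
import math

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def srocc(s, p):
    """Spearman rank order correlation via squared rank differences (no ties)."""
    n = len(s)
    rs, rp = ranks(s), ranks(p)
    return 1 - 6 * sum((a - b) ** 2 for a, b in zip(rs, rp)) / (n * (n * n - 1))

def plcc(s, p):
    """Pearson linear correlation coefficient."""
    ms, mp = sum(s) / len(s), sum(p) / len(p)
    num = sum((a - ms) * (b - mp) for a, b in zip(s, p))
    den = math.sqrt(sum((a - ms) ** 2 for a in s) * sum((b - mp) ** 2 for b in p))
    return num / den

def rmse(s, p):
    """Root mean square error between subjective and predicted scores."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, p)) / len(s))
```

Note that RMSE is only meaningful when the objective scores are on the same scale as the subjective scores, whereas SROCC depends only on the ordering of the scores.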

Parameter Setting
In this subsection, the re-blur method including blur type and filter parameters are discussed. We also discuss the exponent coefficients of luminance, contrast, structural and blurriness, i.e., α, β, γ and λ in Equation (10), to achieve the best performance of our proposed metric.

Filter Type and Parameter for Re-Blur
As mentioned in Section 3.2, we apply a Gaussian low-pass filter to get the re-blurred image of the input image. It should be noted that we compared three types of filters for re-blur, i.e., Gaussian blur, motion blur and defocus blur. The results on LIVE II dataset are illustrated in Table 3. We can see that Gaussian blur leads to the highest SROCC. It is easy to understand such results, noting that the filter for re-blur is of the same type as the distortion of the image in LIVE II dataset. Thus we chose Gaussian filter for re-blur in the experiments. Furthermore, the size and deviation of Gaussian filter will impact running time. According to the experimental results shown in Table 4, we take the Gaussian filter of 11 × 11 with deviation 1.5.

Patch Quantity
Since we partition an image into P × P patches when calculating NSSIM, the size and number of the patches have an impact on processing time. In Table 5, we list the running time (in seconds) taken to compute the quality score of an image of resolution 768 × 512 with 24-bit color from the LIVE II dataset on a 2.6 GHz single-core PC with 4 GB of RAM. When P = 16, each patch is 48 × 32 × 3 and the evaluation takes 2.4373 seconds on average, which is the shortest running time. Thus, we set P = 16 in the experiments in this paper.

Exponent Coefficient of Blurriness Comparison Function
In this section, we evaluate the contribution of λ, which is the exponent coefficient of blurriness comparison function h(x, y). The impact of λ tested on LIVE II [26] dataset is shown in Figure 7 with other parameters the same as SSIM [1]. SROCC, PLCC and RMSE come to the best performance when λ = 1, and when λ = 0 our NSSIM degrades to the traditional SSIM [1]. To achieve the best performance and simplify the expression, we set α = β = γ = λ = 1, C 1 = 0.01, C 2 = 0.03,

Comparison with the State-of-the-Arts
We compared the performance of NSSIM against PSNR, the original SSIM [1], and several state-of-the-art NR IQA models such as BRISQUE [4], BLIIND-II [3], MCNN [19], IQA-CWT [7], SFA [20], etc. In order to evaluate whether the differences between the proposed metric and the existing IQA algorithms are statistically significant, we performed paired-sample t-tests and report the p-values. The null hypothesis in our t-tests is that the pairwise difference between the proposed metric and another metric has a mean equal to zero, i.e., the differences in performance presented in the results are not statistically significant. A p-value < 0.05 indicates the rejection of the null hypothesis at the 5% significance level, meaning that the differences are statistically significant. It should be noted that the results of some IQA algorithms, such as NR-CSR [18], MCNN [19], IQA-CWT [7], and SFA [20], are collected from the corresponding references, thus their p-values are not shown in the result tables. As seen from Table 6, NSSIM achieves a PLCC of 0.8971 and an RMSE of 0.1266 on the CSIQ [25] dataset, better than the SSIM and MCNN metrics. We randomly sampled 100 images from the CSIQ dataset 10 times for the t-test, so that 10 samples of SROCC, PLCC and RMSE were obtained for each algorithm in comparison. The p-values in Table 6 show that the t-tests reject the null hypothesis at the 5% significance level, i.e., the alternative hypothesis is accepted that the pairwise difference between NSSIM and the other metrics does not have a mean equal to zero. This confirms that the differences between the metrics are statistically significant.
Table 6. Comparison with ten existing IQA algorithms on CSIQ dataset (150 Gaussian blur images). We take Spearman's rank order correlation coefficient (SROCC), Pearson linear correlation coefficient (PLCC) and root mean square error (RMSE) as evaluation indexes. In each column, the best performance value is marked in boldface.

Algorithm        SROCC (p-value)      PLCC (p-value)       RMSE (p-value)
PSNR             0.7180 (0.000031)    0.6427 (0.000004)    1.6266 (0.000001)
SSIM [1]         0.8727 (0.005351)    0.8570 (0.000026)    1.8174 (0.000522)
BIQI [14]        0.9119 (0.004789)    0.7144 (0.000756)    2.0347 (0.000011)
BRISQUE [4]      0.9710 (0.010362)    0.9680 (0.004423)    1.5331 (0.001605)
TIP [2]          0.9011 (0.009542)    0.8811 (0.007826)    1.4768 (0.001412)
NIQE [17]        0.9721 (0.017612)    0.9561 (0.010149)    0.1884 (0.005590)
BLIIND-II [3]    0.9177 (0.001637)    0.9111 (0.000488)    0.8750 (0.007369)
MCNN [19]        0.9358 (-)           0.9459 (-)           6.4538 (-)
IQA-CWT [7]      0.9169 (-)           - (-)                7.8650 (-)
SFA [20]         0.9166 (-)           0.8305 (-)           0.7055 (-)
NSSIM            0.9464               0.9689               0.8669

Table 8. Comparison with eight existing IQA algorithms on IVC dataset (20 Gaussian blur images). We take Spearman's rank order correlation coefficient (SROCC), Pearson linear correlation coefficient (PLCC) and root mean square error (RMSE) as evaluation indexes. In each column, the best performance value is marked in boldface.
As seen from Table 9, the SROCC of NSSIM is 0.8995, which beats the other FR and NR IQA algorithms on the TID2013 [28] dataset. A paired-sample t-test was performed as was done on the CSIQ and LIVE II datasets. The p-values demonstrate that the superior SROCC performance of NSSIM over the other algorithms is robust.
Table 9. Comparison with eight existing IQA algorithms on TID2013 dataset (125 Gaussian blur images). We take Spearman's rank order correlation coefficient (SROCC), Pearson linear correlation coefficient (PLCC) and root mean square error (RMSE) as evaluation indexes. In each column, the best performance value is marked in boldface.
Furthermore, Table 10 gives the means and standard deviations of SROCC, PLCC, and RMSE of the tested IQA algorithms on the four datasets. Only IQA algorithms tested on all four datasets are included. It can be seen from Table 10 that our NSSIM metric achieves the highest mean SROCC and the third-best mean PLCC and RMSE.
Meanwhile, the t-test results shown in Table 10 led to the acceptance of the null hypothesis at the 5% significance level, indicating that the differences in average performance between NSSIM and the other metrics are not statistically significant across the datasets. This is understandable since our NSSIM cannot achieve significant improvement on every dataset in terms of all indexes. However, considering the differences in image complexity between datasets, such as image size, contrast and diversity, the experimental results validate that the proposed NSSIM can be adapted to different image categories. Noting that [3,4,17] all need a prior training procedure, our NSSIM offers the best balance between IQA performance and time efficiency.
Table 10. Average IQA performance (mean ± standard deviation) on four datasets. We take Spearman's rank order correlation coefficient (SROCC), Pearson linear correlation coefficient (PLCC) and root mean square error (RMSE) as evaluation indexes. In each column, the best performance value is marked in boldface.
These test results also validate that NSSIM reflects the relationship between quantified image naturalness and perceptual image quality. NSSIM establishes a simple method to identify image quality without a reference or prior training on human judgments of blurred images. Besides, compared with the up-to-date NR IQA metrics NR-CSR [18], MCNN [19] and SFA [20], NSSIM is less time-consuming since it needs no training or learning procedure.
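The paired-sample t-test used in this comparison can be sketched as follows. The helper name and the example numbers are made up purely for illustration and are not results from the paper; the decision uses the two-sided 5% critical value of Student's t distribution with 9 degrees of freedom (2.262), which corresponds to 10 paired samples.

```python
import math

def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for the paired differences a_i - b_i."""
    diffs = [u - v for u, v in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n), n - 1

T_CRIT_DF9 = 2.262  # two-sided 5% critical value, 9 degrees of freedom

# Hypothetical SROCC values of two metrics on 10 random subsets (illustration only).
metric_a = [0.95, 0.94, 0.96, 0.95, 0.93, 0.94, 0.95, 0.96, 0.94, 0.95]
metric_b = [0.90, 0.91, 0.92, 0.90, 0.89, 0.90, 0.91, 0.92, 0.90, 0.91]

t, df = paired_t_statistic(metric_a, metric_b)
reject_null = df == 9 and abs(t) > T_CRIT_DF9  # True: the mean difference is nonzero
```

Rejecting the null hypothesis here only says that the mean paired difference is unlikely to be zero; it does not by itself say which metric is better on a given index.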

Consistency with Subjective DMOS Scores
We analyzed the consistency between NSSIM scores and the subjective DMOS scores on the four datasets. Scatter plots of NSSIM and DMOS are shown in Figure 8. For the CSIQ and LIVE II datasets, our NSSIM has negative correlation with DMOS because NSSIM decreases as image blurriness increases while DMOS increases with image blurriness. For the IVC and TID2013 datasets, NSSIM and the subjective scores both decrease as image blurriness increases, so they are positively correlated. The experimental results demonstrate that our NSSIM is consistent with the HVS, and thus can be used for IQA effectively.

IQA for Blurred Image Restoration
The purpose of image restoration is to reduce or remove the image degradation introduced during acquisition, compression, transmission, processing, and reproduction. IQA can be used to evaluate an image restoration algorithm by assessing the qualities of the distorted image and the restored image. Sroubek [30] presented a deconvolution algorithm for decomposition and approximation of space-variant blur using the alternating direction method of multipliers. Kotera [31] proposed a blind deconvolution algorithm using the variational Bayesian approximation with the automatic relevance determination model on likelihood and image and blur priors. In this section, we use the proposed NSSIM to evaluate the performance of image restoration. Two groups of images, each including the original image, the blurred image and the restorations, are evaluated by the proposed IQA algorithm NSSIM, by PSNR and SSIM, and by several state-of-the-art NR IQA algorithms. The experimental results are shown in Figures 9 and 10 and Tables 11 and 12.
Figure 9. Group 1 restorations: The original image is 480 × 720 × 3 and is provided by the LIVE II dataset, while the blurred image is produced by a Gaussian low-pass filter of 11 × 11 with deviation 1.5. The restorations are produced by Sroubek [30] and Kotera [31], respectively.
Figure 10. Group 2 restorations: The original image is 512 × 512 × 3 and is provided by the IVC dataset, while the blurred image is produced by a Gaussian low-pass filter of 11 × 11 with deviation 1.5. The restorations are produced by Sroubek [30] and Kotera [31], respectively.
It can be seen from Figure 9 and Table 11 that PSNR and SSIM [1] fail to evaluate the quality of both restorations, since their quality scores are smaller than that of the blurred image. BIQI [14] and NIQE [17] succeed in identifying the restored images, but the difference between the two restorations is slight. In contrast, BRISQUE [4], BLIIND-II [3], SFA [20] and the proposed metric NSSIM achieve better precision, and the differences between the two restorations are distinct.
Moreover, the NSSIM-predicted score of blurred image is 0.0709 × 10 −4 , showing obvious variance between the blurred image and the original image. This demonstrates that NSSIM is extremely sensitive to blur. Similarly, Figure 10 and Table 12 also demonstrate that NSSIM is suitable for blur IQA.

Conclusions
IQA is important and useful for image acquisition and processing systems in many applications. In this paper, we focus on IQA for blurred images. We have proposed a novel NR IQA metric called NSSIM based on SSIM and re-blur theory. The proposed NSSIM takes advantage of the mathematical convenience of SSIM and extends it from FR to NR. We blur the distorted image and take the re-blurred image as a reference. The definition of image blurriness is given by evaluating the grayscale distribution. We score image quality by taking four image features into consideration: luminance, contrast, structure and blurriness. We discussed the impact of the parameters of our algorithm on its performance. We tested the proposed NSSIM metric on four datasets. The experimental results show that NSSIM achieves promising performance and is consistent with the HVS. Compared to existing IQA models, NSSIM needs neither a reference nor a prior training or learning procedure, which makes it more time-efficient and convenient to apply. We also applied the proposed metric to IQA for image restoration, which shows that our metric is practically useful. We believe that NSSIM has great potential to be applied in unconstrained environments.