No-Reference Image Quality Assessment with Global Statistical Features

The perceptual quality of digital images is often deteriorated during storage, compression, and transmission. The most reliable way of assessing image quality is to ask people to provide their opinions on a number of test images. However, this is an expensive and time-consuming process which cannot be applied in real-time systems. In this study, a novel no-reference image quality assessment method is proposed. The introduced method uses a set of novel quality-aware features that globally characterize the statistics of a given test image, such as an extended local fractal dimension distribution feature, extended first digit distribution features computed in different domains, Bilaplacian features, image moments, and a wide variety of perceptual features. Experimental results are demonstrated on five publicly available benchmark image quality assessment databases: CSIQ, MDID, KADID-10k, LIVE In the Wild, and KonIQ-10k.


Introduction
As digital media takes a more central part in our daily lives and work, research on image and video quality assessment becomes more and more important. In many cases, the visual quality has to be optimized for content, such as movies and sports games. In these cases, automatic assessment methods should take the actual image or video content into account to give the viewer the best experience. In medical imaging, poor image quality may lead to a misdiagnosis.
There are two ways of measuring image quality [1]. The obvious way is to ask people to give their opinions on a number of test images, which is called subjective quality assessment. However, such a procedure is time-consuming, and setting up the experimental environment is expensive. That is why objective image quality assessment has become a hot research topic: it deals with mathematical models and algorithms that are able to assess the perceptual quality of digital images automatically. In the literature, objective image quality assessment algorithms are grouped according to the availability of the reference, pristine image. Specifically, full-reference image quality assessment (FR-IQA) algorithms possess full information about both the distorted image and the reference image, while no-reference image quality assessment (NR-IQA) methods predict perceptual quality exclusively based on the distorted image. Reduced-reference image quality assessment (RR-IQA) represents a middle course because it possesses partial information about the reference image and full information about the distorted image.

Related Work
NR-IQA has gained a lot of attention in recent decades. Although the reference image is not available for NR-IQA algorithms, they can make assumptions about the distortions present in a given input image. Hence, they can be divided into distortion-specific and general-purpose groups. As the name indicates, distortion-specific methods assume the presence of certain distortions, such as JPEG [2] or JPEG2000 [3] compression noise. In contrast, general-purpose algorithms do not restrict themselves to specific distortions. General-purpose methods can be further divided into opinion-unaware [4,5] and opinion-aware [6] classes. Opinion-unaware ones do not require subjective quality scores in the training process, while opinion-aware algorithms usually rely on different regression frameworks trained on subjective scores.
Models based on natural scene statistics (NSS) have been very popular in opinion-aware NR-IQA. The main idea is that pristine (distortion-free) images obey certain statistical regularities and the statistics of distorted images deviate significantly from these regularities. As a consequence, these models contain three distinct stages: (1) feature extraction, (2) NSS modeling, and (3) regression. Hence, the main differences between NSS-based algorithms are connected to the above-mentioned three steps. For instance, the blind image quality index (BIQI) [7] extracts features in the wavelet domain over three scales and three orientations. Moreover, a generalized Gaussian distribution is fitted to the sub-band coefficients and the fitting parameters are utilized as quality-aware features. Finally, a trained support vector regressor (SVR) is applied to map features onto perceptual quality scores. In contrast, the blind image integrity notator using DCT statistics (BLIINDS) [8] utilizes the statistics of local DCT coefficients, and the mapping from features to quality scores is carried out by a probabilistic prediction algorithm. Liu et al. [9] utilized the orientation information from the curvelet transform to determine the correlation between scale and orientation energy distributions. Similarly to [7], an SVR is used to map the feature vectors onto quality scores. Gu et al. [10] combined NSS-based features with the free energy principle. He et al. [11] integrated NSS-based features and sparse representation. Mittal et al. [6] extracted NSS-based features from the spatial domain. Namely, mean subtracted contrast normalized (MSCN) coefficients were first determined from the raw pixel data. Subsequently, a generalized Gaussian distribution was fitted to the MSCN coefficients. Moreover, an asymmetric generalized Gaussian distribution was also fitted to the products of neighboring MSCN coefficients. Similarly to BIQI [7], the fitting parameters were considered as quality-aware features and mapped to perceptual quality scores with an SVR. In [12], NSS features from multiple domains were combined. In contrast, Jenadeleh and Moghaddam [13] estimated the parameters of NSS features with a Wakeby distribution model.
Another line of papers extracts quality-aware statistical features directly from images and maps them to quality scores. Zhang et al. [14] generated quality-aware features from the joint generalized local binary pattern statistics. In contrast, Li et al. [15] proposed a gradient-weighted histogram of local binary patterns for quality-aware features. In [16], a set of quality-aware statistical features (first digit distributions in the gradient magnitude and wavelet domains, color statistics) was combined with powerful perceptual features (colorfulness, global contrast factor, entropy, etc.) to train a Gaussian process regression (GPR) algorithm for quality prediction.
Recently, convolutional neural networks (CNNs) have become a prominent technology in the field of image processing. The deployment of CNNs in NR-IQA is gaining a lot of attention due to their representational power. Usually, CNNs consist of four types of components, namely convolutional, activation, pooling, and fully-connected layers stacked on top of each other. Moreover, features extracted from CNNs trained on huge databases, such as ImageNet [17], have shown excellent representational power in many image processing tasks [18-20]. Kang et al. [21] proposed the first CNN-based solution for NR-IQA. Specifically, the authors trained a CNN regression framework on 32 × 32 non-overlapping image patches. The perceptual quality of the overall input image was determined by pooling the patches' quality scores. Later, the proposed architecture was developed further by Kang et al. [22] to simultaneously estimate perceptual quality and image distortion types. Similarly, Kim et al. [23] proposed a regression CNN framework, but FR-IQA behavior was first imitated by generating a local quality map. Namely, the patches were first regressed onto quality scores obtained by a traditional FR-IQA metric. In contrast, Bianco et al. [24] applied a fine-tuned AlexNet [25] to extract deep features from 227 × 227-sized image patches.
Zeng et al. [26] developed the approach of [24] further. Namely, they extracted features with the help of a ResNet [27] architecture and elaborated a probabilistic representation of distorted images. In [28], an Inception-V3 [29] network was utilized as a feature extractor, and it was pointed out that considering the features of multiple layers improves the performance of perceptual quality prediction. In contrast, Liu et al. [30] trained a Siamese CNN to rank images in terms of perceptual quality. Subsequently, the trained Siamese CNN was used to transfer knowledge into a traditional CNN. Lin and Wang [31] proposed a quality-aware generative network for reference image generation. To this end, a quality-aware loss function was also proposed. Moreover, the knowledge about the discrepancy between real and generated reference images was incorporated into a regression CNN which estimated the perceptual quality of distorted images.
Another line of NR-IQA algorithms focuses on combining the results of existing methods to improve prediction performance [32,33]. For instance, Ieremeiev et al. [34] trained a neural network on the results of eleven different NR-IQA algorithms to boost performance.

Contributions
In this study, an NR-IQA method is presented which relies on a novel feature vector containing a set of quality-aware features that globally characterize the statistics of a given input image. Specifically, the proposed feature vector further improves parts of our previous work [16]. A set of shape descriptors is attached to the local fractal dimension distribution and first digit distribution feature vectors to better capture image distortions. Moreover, we point out that besides the wavelet coefficients [16], the discrete cosine transform coefficients and the singular values of an image are also suitable for deriving first digit distribution features based on Benford's law. Motivated by the model of the extended classical receptive field (ECRF), Bilaplacian quality-aware features are also incorporated into the introduced model. Unlike previous methods, the degradation of image edges is quantified by image moments. Experimental results and a performance comparison to the state-of-the-art are presented on five publicly available IQA benchmark databases: CSIQ, MDID, KADID-10k, LIVE In the Wild, and KonIQ-10k.

Structure
The rest of this study is organized as follows. Section 2 describes our proposed method for NR-IQA. Next, Section 3 shows experimental results and analysis including the description of the applied IQA benchmark databases and evaluation protocol, a parameter study, and a comparison to other state-of-the-art algorithms. Finally, a conclusion is drawn in Section 4.

Proposed Method
The general overview of the proposed NR-IQA method is shown in Figure 1. As the workflow indicates, a set of feature vectors is extracted from the training images to train a machine learning model, which is applied in the testing phase to map feature vectors onto perceptual quality scores. In Section 3, a detailed parameter study is presented to find the most suitable regression technique using five IQA benchmark databases.
As pointed out by Ghadiyaram and Bovik [35], a diverse set of features is necessary to accurately predict the perceptual quality of artificially and authentically distorted digital images. In this study, a novel set of quality-aware features is proposed that characterizes an image by taking into account its global statistics. The introduced method relies on a 132-dimensional feature vector including an extended local fractal dimension distribution feature vector, extended first digit distribution (FDD) feature vectors, Bilaplacian features, image moments, the histogram variances of the gradient magnitude (GM), relative gradient magnitude (RM), and relative gradient orientation (RO) maps, and perceptual features (colorfulness, sharpness, dark channel feature, contrast). The used features are summarized in Table 1, where the quality-aware features proposed by this study are typed in bold.

Extended Local Fractal Dimension Distribution Feature Vector
In [41], Pentland demonstrated that natural scenes, such as mountains, trees, clouds, etc., can be described by fractal surfaces because fractals resemble natural surfaces. Various image distortions often change the local regularities of digital images' texture. Thus, distortions change the local fractal dimension distribution of a given test image. Consequently, the histogram of the local fractal dimension distribution is a quality-aware feature [16]. As in our previous study [16], the local fractal dimension map of an image is created by considering each pixel in the original image as the center of a 7-by-7 rectangular neighborhood and calculating the fractal dimension from this neighborhood. The box-counting method is applied to determine the fractal dimension of an image patch because it is able to represent complexity and is easy to implement [42,43]. Similarly to our previous work [16], a 10-bin normalized histogram was calculated considering the values between −2 and 3 from the local fractal dimension map. Figure 2 depicts the local fractal dimension maps of a reference-distorted image pair. It can be observed that distortions in texture appear very strongly in the local fractal dimension map.
Although the normalized histogram of the local fractal dimension distribution is able to describe the irregularities of natural scenes, the following statistics are attached to the normalized histogram to construct an effective feature vector: skewness, kurtosis, entropy, median, spread, and standard deviation. The skewness is determined as

$$\mathrm{skewness}(v)=\frac{E\big[(v-\bar{v})^{3}\big]}{\mathrm{std}(v)^{3}},$$

where $\bar{v}$ stands for the mean of $v$ and $\mathrm{std}(v)$ is the standard deviation of $v$. The kurtosis is obtained as

$$\mathrm{kurtosis}(v)=\frac{E\big[(v-\bar{v})^{4}\big]}{\mathrm{std}(v)^{4}}.$$

The entropy is obtained as

$$H(v)=-\sum_{i} p_i(v)\log_2 p_i(v),$$

where $p_i(v)$ stands for the $i$-th normalized histogram count of $v$.
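The MATLAB sketch below illustrates how such an extended histogram feature can be assembled from a local fractal dimension map FD; the function name, the binning on [−2, 3], and the interpretation of spread as the range of values are illustrative assumptions rather than the exact implementation.

% Minimal sketch: 10-bin normalized histogram on [-2, 3] plus shape statistics.
function f = extendedHistogramFeature(FD)
    v = FD(:);
    h = histcounts(v, linspace(-2, 3, 11));      % 10 bins on [-2, 3]
    h = h / sum(h);                              % normalize to unit sum
    m = mean(v); s = std(v);
    skew = mean((v - m).^3) / s^3;               % skewness
    kurt = mean((v - m).^4) / s^4;               % kurtosis
    p = h(h > 0);
    ent = -sum(p .* log2(p));                    % entropy of the normalized histogram
    spread = max(v) - min(v);                    % "spread" interpreted here as the range
    f = [h, skew, kurt, ent, median(v), spread, s];
end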

Extended First Digit Distribution Feature Vectors
Benford's distribution concerns the leading digit (the first non-zero digit, range: 1-9) of values in a data set. Frank Benford published an article entitled "The law of anomalous numbers" in 1938 [44] in which he analyzed the leading digit values from diverse sources, such as populations of counties, lengths of rivers, or death rates. Benford conjectured that the distribution of the leading digit $x = 1, 2, \ldots, 9$ has probability mass function

$$B(x)=\log_{10}\left(1+\frac{1}{x}\right). \qquad (4)$$

Data sets whose leading digits follow the particular pattern defined by Equation (4) are said to satisfy Benford's law. It was pointed out by Pérez-González et al. [45] that the luminance values of digital images do not satisfy Benford's law. However, the discrete cosine transform (DCT) coefficients of a digital image produce a good match with Benford's law [45].
In our previous work [16], we utilized the wavelet domain to obtain first digit distribution (FDD) feature vectors, since we pointed out that the FDD in the wavelet transform domain matches very well with Benford's law prediction in the case of distortion-free, pristine images. On the other hand, various image distortions result in a significant deviation of the FDD from the prediction of Benford's law. However, the FDDs of the discrete cosine transform (DCT) coefficients and of the singular values show similar properties to those of the wavelet domain. In this study, normalized FDD feature vectors are extracted from the DCT coefficients [45] and the singular values, besides the wavelet transform domain. Moreover, the FDD feature vectors are augmented by statistics as in the previous subsection, namely the symmetric Kullback-Leibler divergence between the actual FDD and Benford's law prediction, skewness, kurtosis, entropy, median, spread, and standard deviation. The symmetric Kullback-Leibler (sKL) divergence is determined between the actual FDD (denoted by $P(x)$, $x = 1, 2, \ldots, 9$) and Benford's distribution (denoted by $B(x)$, $x = 1, 2, \ldots, 9$):

$$sKL(P,B)=KL(P\,\|\,B)+KL(B\,\|\,P),$$

where the Kullback-Leibler (KL) divergence is given as:

$$KL(P\,\|\,B)=\sum_{x=1}^{9} P(x)\log\frac{P(x)}{B(x)}.$$

In addition to sKL, skewness, kurtosis, entropy, median, spread, and standard deviation were also attached to the normalized FDD to obtain the extended FDD feature vector. As a result, an extended FDD feature vector has a length of 17. Moreover, extended FDD feature vectors are extracted from the horizontal, vertical, and diagonal wavelet coefficients, the DCT coefficients, and the singular values. Table 2 illustrates the average FDD of singular values in the KADID-10k [46] database with respect to the five different distortion levels found in this database. It can be observed that the sKL between the actual FDD and Benford's distribution is roughly proportional to the level of distortion. Furthermore, the relative frequencies of ones and twos are also roughly proportional to the level of image distortion. That is why the FDDs in different domains were chosen as quality-aware descriptors and were extended with sKL and histogram shape descriptors, such as skewness, kurtosis, entropy, median, spread, and standard deviation.
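As an illustration, a hedged MATLAB sketch of an FDD feature computed in the DCT domain is given below; the pre-processing choices (grayscale conversion, discarding zero coefficients, histogram smoothing) are assumptions made for this sketch, and the shape statistics of the previous subsection would be appended analogously.

% Sketch of an extended FDD feature in the DCT domain with the sKL divergence.
function f = extendedFDDFeatureDCT(img)
    C = dct2(double(rgb2gray(img)));               % 2-D DCT coefficients
    c = abs(C(:)); c = c(c > 0);                   % discard zero coefficients
    d = floor(c ./ 10.^floor(log10(c)));           % leading (first) digits, 1..9
    P = histcounts(d, 0.5:1:9.5);
    P = (P + eps) / sum(P + eps);                  % actual FDD, smoothed to avoid zeros
    B = log10(1 + 1 ./ (1:9));                     % Benford's distribution
    kl = @(p, q) sum(p .* log(p ./ q));            % Kullback-Leibler divergence
    sKL = kl(P, B) + kl(B, P);                     % symmetric KL divergence
    f = [P, sKL];                                  % shape statistics would follow here
end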

Bilaplacian Features
Gerhard et al. [47] pointed out that the human visual system (HVS) is adapted to the statistical regularities in images. Moreover, Marr [48] emphasized the importance of studying zero-crossings at multiple scales to interpret the intensity changes found in an image. At the same time, the extended classical receptive field (ECRF) of retinal ganglion cells can be modeled as a combination of three zero-mean Gaussians at three different scales [49]. These are equivalent to a Bilaplacian of the Gaussian filter [49,50]. On the other hand, Gaussian filtering introduces an undesirable distortion in IQA. In our method, the Bilaplacian features are obtained in the YCbCr color space, since it is suggested by ITU-R BT.601 for video broadcasting. The direct conversion from the RGB color space to YCbCr is the following:

$$\begin{aligned} Y &= 0.299R + 0.587G + 0.114B,\\ Cb &= 0.564(B - Y),\\ Cr &= 0.713(R - Y), \end{aligned}$$

where R, G, and B denote the red, green, and blue color channels, respectively.
Generally, Laplacian filters are approximated by small convolution kernels whose elements sum to zero [51]. In this paper, five popular Laplacian kernels, denoted by $L_1, \ldots, L_5$, are utilized. An image can be converted to the Bilaplacian domain by convolving it with two Laplacian kernels; formally, the Bilaplacian masks can be written as

$$L^2_{ij} = L_i * L_j,$$

where $*$ stands for the operation of convolution. In our study, the $L^2_{11}$, $L^2_{22}$, $L^2_{33}$, $L^2_{44}$, $L^2_{55}$, $L^2_{13}$, and $L^2_{24}$ masks are considered. As already mentioned, the channels of the YCbCr color space are used to obtain the Bilaplacian features. This means that the Y, Cb, and Cr channels are convolved with the Bilaplacian masks independently from each other. As a consequence, seven Bilaplacian maps can be obtained for each color channel. Subsequently, the histogram variance of each of these maps is taken. The histogram variance is defined as

$$HV = \sum_{v} (v - \mu)^2\, h(v), \qquad \mu = \sum_{v} v\, h(v),$$

where $h(v)$ stands for the histogram of the map, normalized to unit sum. Figure 3 illustrates a reference, distortion-free image and its artificially distorted counterpart from the KADID-10k [46] database. It can be seen that even a moderate amount of noise can significantly distort the normalized histogram of the Bilaplacian feature maps. That is why the histogram variances of the Bilaplacian feature maps were applied as quality-aware features.
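The following MATLAB sketch illustrates the idea for a single Bilaplacian mask; the two Laplacian kernels shown (the common 4- and 8-neighbor approximations), the 100-bin histogram, and the input variable img are assumptions of this illustration, not necessarily the exact choices of the proposed method.

% Illustrative sketch for one Bilaplacian mask applied to the YCbCr channels.
Li = [0 1 0; 1 -4 1; 0 1 0];                      % assumed 4-neighbor Laplacian kernel
Lj = [1 1 1; 1 -8 1; 1 1 1];                      % assumed 8-neighbor Laplacian kernel
L2ij = conv2(Li, Lj);                             % Bilaplacian mask L^2_ij = L_i * L_j
ycbcr = double(rgb2ycbcr(img));
hv = zeros(1, 3);
for ch = 1:3
    B = conv2(ycbcr(:, :, ch), L2ij, 'same');     % Bilaplacian map of the channel
    [h, edges] = histcounts(B(:), 100);
    h = h / sum(h);                               % histogram normalized to unit sum
    centers = (edges(1:end-1) + edges(2:end)) / 2;
    mu = sum(centers .* h);
    hv(ch) = sum(((centers - mu).^2) .* h);       % histogram variance of the map
end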

Image Moments
First, the Sobel operator computes an approximation of the gradient of an image. If I is considered as the source image, $G_x$ and $G_y$ are determined as

$$G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} * I, \qquad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix} * I,$$

where $*$ stands for the convolution operator and $G_x$ and $G_y$ are the horizontal and vertical derivative approximations, respectively. The gradient magnitude approximation can be obtained as

$$G = \sqrt{G_x^2 + G_y^2}.$$

The binary Sobel edge map is determined by thresholding $G$ using four times the mean of $G$ as the cutoff threshold. Finally, edge thinning is applied to remove spurious points from the edge map [54]. The central moments of the digital image $I(x, y)$ are defined as

$$\mu_{pq} = \sum_{x}\sum_{y} (x - \bar{x})^p (y - \bar{y})^q\, I(x, y),$$

where $\bar{x}$ and $\bar{y}$ are the coordinates of the binary image's centroid. By definition, the centroid of a binary image is the arithmetic mean of all its foreground $(x, y)$ coordinates. It can be shown that central moments are translation invariant [55] (Figure 4).
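A possible MATLAB realization of this step is sketched below; the input variable img, the use of bwmorph for edge thinning, and the two example moment orders are assumptions made for illustration.

% Sketch: Sobel gradients, thresholding at 4x the mean magnitude, thinning, moments.
gray = double(rgb2gray(img));
Gx = conv2(gray, [-1 0 1; -2 0 2; -1 0 1], 'same');   % horizontal derivative
Gy = conv2(gray, [-1 -2 -1; 0 0 0; 1 2 1], 'same');   % vertical derivative
G  = sqrt(Gx.^2 + Gy.^2);                             % gradient magnitude
E  = bwmorph(G > 4 * mean(G(:)), 'thin', Inf);        % binary, thinned Sobel edge map
[y, x] = find(E);                                     % coordinates of edge pixels
xbar = mean(x); ybar = mean(y);                       % centroid of the binary edge map
mu20 = sum((x - xbar).^2);                            % example central moment mu_20
mu02 = sum((y - ybar).^2);                            % example central moment mu_02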

Gradient Features
Image gradient magnitude and orientation features have become very popular both in FR-IQA and NR-IQA since they are strong predictive factors of perceptual image quality [36]. In this study, the histogram variances of gradient magnitude (GM), relative gradient orientation (RO), and relative gradient magnitude (RM) are incorporated into our model to quantify the changes in gradient [36].
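As a rough illustration, the three maps can be formed as in the MATLAB sketch below; subtracting a local 3 × 3 average to obtain the relative maps is one common construction and, like the input variable img, is an assumption of this sketch rather than the exact procedure of [36].

% Hedged sketch of the GM, RM, and RO maps.
[Gmag, Gdir] = imgradient(double(rgb2gray(img)));       % gradient magnitude and orientation
GM = Gmag;                                              % gradient magnitude map
RM = Gmag - imfilter(Gmag, fspecial('average', 3));     % relative gradient magnitude map
RO = Gdir - imfilter(Gdir, fspecial('average', 3));     % relative gradient orientation map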

Perceptual Features
The following perceptual features are adopted in our model, since they are consistent with the HVS's perception of quality. Specifically, colorfulness [37], sharpness [38], the dark channel feature [39], and contrast [40] were applied in our study.
Yendrikhovskij et al. [56] demonstrated that colorfulness plays an important role in human perceptual quality judgments, since humans tend to prefer more colorful images. In this study, the metric of Hasler and Suesstrunk [37] was adopted:

$$CF = \sqrt{\sigma_{rg}^2 + \sigma_{yb}^2} + 0.3\cdot\sqrt{\mu_{rg}^2 + \mu_{yb}^2},$$

where $rg = R - G$ and $yb = \frac{1}{2}(R + G) - B$. Furthermore, R, G, and B stand for the red, green, and blue channels, respectively. Variables $\sigma$ and $\mu$ denote the standard deviation and mean of the matrices given in the subscripts, respectively.
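This metric translates directly into a few lines of MATLAB; the sketch below assumes img is an RGB input image.

% Direct computation of the colorfulness metric defined above.
R = double(img(:, :, 1)); G = double(img(:, :, 2)); B = double(img(:, :, 3));
rg = R - G;                                   % red-green opponent channel
yb = 0.5 * (R + G) - B;                       % yellow-blue opponent channel
CF = sqrt(std(rg(:))^2 + std(yb(:))^2) + 0.3 * sqrt(mean(rg(:))^2 + mean(yb(:))^2);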
Image sharpness determines the amount of detail that is rendered in an image. Sharpness can be observed most clearly along image edges, and for that reason it is widely considered an image quality factor. In this study, the metric of Bahrami and Kot [38], maximum local variation (MLV), was adopted to characterize the sharpness of an image because of its low computational cost.
Dark channel features were originally designed for single image haze removal [39], and Tang et al. [57] first applied them to photo quality assessment. An image's (I) dark channel ($I^{dark}$) is defined as:

$$I^{dark}(i) = \min_{j \in \Omega(i)} \left( \min_{c \in \{R, G, B\}} I^c(j) \right),$$

where $I^c$ is a color channel of I ($c \in \{R, G, B\}$) and $\Omega(i)$ is a neighborhood of pixel i. In our implementation, $\Omega(i)$ is a rectangular 15 × 15-sized patch. The dark channel feature of image I is defined as:

$$DCF = \frac{1}{S} \sum_{i} I^{dark}(i),$$

where S denotes the area of the input image.
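A compact MATLAB sketch of this feature is given below, assuming img is an RGB input image and following the averaging form of the definition above; grayscale erosion with a flat 15 × 15 structuring element implements the local minimum.

% Sketch of the dark channel feature with a 15 x 15 neighborhood Omega(i).
minRGB = min(double(img), [], 3);             % per-pixel minimum over the R, G, B channels
Idark  = imerode(minRGB, ones(15));           % local minimum over the 15 x 15 neighborhood
DCF    = mean(Idark(:));                      % sum of the dark channel divided by the image area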
There are many definitions of image contrast in the literature. The simplest interpretation of contrast is the difference between the brightest and darkest pixel values. The HVS's capability to recognize and separate objects in an image heavily depends on image contrast; consequently, contrast is an image quality factor. In this study, Matkovic et al.'s [40] global contrast factor (GCF) model was adopted, which is defined as follows:

$$GCF = \sum_{i=1}^{9} w_i \cdot C_i,$$

where $w_i = \left(-0.406385 \cdot \frac{i}{9} + 0.334573\right) \cdot \frac{i}{9} + 0.0877526$, $i \in \{1, 2, \ldots, 9\}$. Moreover, the $C_i$s are the average local contrasts at nine different resolution levels, defined as

$$C_i = \frac{1}{w \cdot h} \sum_{j=1}^{w \cdot h} lc_j, \qquad lc_j = \frac{|L_j - L_{j-1}| + |L_j - L_{j+1}| + |L_j - L_{j-w}| + |L_j - L_{j+w}|}{4},$$

where w and h stand for the width and height of the input image, respectively, and the $L$s denote the pixel values after gamma correction ($\gamma = 2.2$), assuming that the image is reshaped into a row-wise one-dimensional array. Table 3 illustrates the average values of the applied perceptual features (CF, sharpness, DCF, and GCF) in the KADID-10k [46] database with respect to the five different distortion levels. It can be observed that the applied four perceptual features strongly correlate with the distortion levels.
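The MATLAB sketch below approximates the GCF computation; the dyadic block-averaging used to form the nine resolution levels and the simplified boundary handling are assumptions of this illustration, not the exact procedure of [40].

% Approximate GCF sketch: nine resolution levels via block averaging; the average
% local contrast is estimated from horizontal and vertical neighbor differences
% (each adjacent pair counted once, hence the division by 2*numel).
L = (double(rgb2gray(img)) / 255) .^ 2.2;     % gamma-corrected luminance
GCF = 0;
for i = 1:9
    s = 2^(i - 1);                            % assumed superpixel size at level i
    Li = imresize(L, 1 / s, 'box');           % block-averaged lower resolution
    d = abs(diff(Li, 1, 1));                  % vertical neighbor differences
    e = abs(diff(Li, 1, 2));                  % horizontal neighbor differences
    Ci = (sum(d(:)) + sum(e(:))) / (2 * numel(Li));  % approximate average local contrast
    wi = (-0.406385 * i / 9 + 0.334573) * i / 9 + 0.0877526;
    GCF = GCF + wi * Ci;
end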

Experimental Results
In this section, our experimental results are presented. Section 3.1 gives a brief overview of the publicly available IQA benchmark databases used. Next, Section 3.2 describes the experimental setup and evaluation metrics. Section 3.3 contains a parameter study in which our design choices are reasoned. Subsequently, Section 3.4 consists of a performance comparison to other state-of-the-art NR-IQA algorithms using publicly available IQA benchmark databases. Finally, Sections 3.5 and 3.6 contain detailed results with respect to distortion types and levels.
CSIQ [58] has 30 reference images, each one distorted by one of six predefined distortion types at four or five different distortion levels. MDID [59] contains 20 reference images and 1600 distorted images derived from the reference images using multiple distortions of random types and distortion levels. Moreover, the authors [59] proposed a novel subjective rating method, called pair comparison sorting, to obtain more accurate data. KADID-10k [46] consists of 10,125 distorted images derived from 81 pristine (distortion-free) reference images using 25 different distortion types at 5 different distortion levels. Moreover, each image is associated with a differential MOS value in the range of [1, 5]. In contrast, the LIVE In the Wild [35] database contains images captured by mobile camera devices, so the images are affected by an intricate mixture of different distortion types. In total, it contains 1162 authentically distorted images which were evaluated by 8100 human observers. Similarly, the KonIQ-10k [60] database consists of digital images with authentic distortions. Specifically, 10,073 images were sampled from the YFCC100m [61] database using seven quality indicators, one content indicator, and machine tags. Moreover, 120 quality ratings were collected for each image using crowdsourcing platforms. Table 4 presents a comparison of the applied IQA benchmark databases with respect to their main characteristics.

Experimental Setup and Evaluation Metrics
To evaluate our model and other state-of-the-art algorithms, the databases containing artificial distortions (CSIQ [58], MDID [59], and KADID-10k [46]) are divided into a training set and a test set with respect to the pristine, reference images to avoid any semantic content overlap between the two sets. The databases with authentic distortions (LIVE In the Wild [35] and KonIQ-10k [60]) are simply divided into a training and a test set. In both cases, approximately 80% of the images are in the training set and the remaining 20% are in the test set. In this study, two widely applied correlation criteria are employed: Pearson's linear correlation coefficient (PLCC) and Spearman's rank order correlation coefficient (SROCC). For both PLCC and SROCC, a higher value indicates a better performance of the examined NR-IQA algorithm. Furthermore, we report average PLCC and SROCC values measured over 100 random train-test splits.
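For reference, both criteria are available through MATLAB's corr function; in the sketch below, pred and mos are placeholder names for the predicted scores and the ground-truth subjective scores of a held-out test set.

% Evaluation criteria on a held-out test set (pred: predictions, mos: subjective scores).
PLCC  = corr(pred(:), mos(:), 'Type', 'Pearson');
SROCC = corr(pred(:), mos(:), 'Type', 'Spearman');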

Parameter Study
In this subsection, a parameter study is carried out to find an optimal regression technique for the proposed quality-aware global statistical features. Specifically, we experimented with five different regression algorithms: rational quadratic Gaussian process regression (GPR) [62], Gaussian support vector regression (SVR) [63], linear SVR [63], binary tree regression (BTR) [64], and random forest regression (RFR) [65]. The results are summarized in Figure 5. It can be seen that rational quadratic GPR provides the best performance on CSIQ [58], KADID-10k [46], and LIVE In the Wild [35]. On MDID [59], RFR provides the best results, while rational quadratic GPR is the second best. On KonIQ-10k [60], rational quadratic GPR and RFR give similar results. As a consequence, rational quadratic GPR was chosen in our method. This architecture is codenamed GSF-IQA in the following subsections and compared to the state-of-the-art.
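A minimal sketch of the chosen regression step is shown below, assuming a feature matrix X (one row per training image), subjective scores y, and test features Xtest; hyperparameter settings beyond the kernel choice are omitted, and the variable names are assumptions of this sketch.

% Rational quadratic GPR as the quality predictor.
mdl  = fitrgp(X, y, 'KernelFunction', 'rationalquadratic', 'Standardize', true);
pred = predict(mdl, Xtest);                   % predicted perceptual quality scores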
As already mentioned, five benchmark IQA databases are used in this study: CSIQ [58], MDID [59], KADID-10k [46], LIVE In the Wild [35], and KonIQ-10k [60]. The measured results of the proposed method and other state-of-the-art algorithms on artificial distortions (CSIQ [58], MDID [59], and KADID-10k [46]) can be seen in Table 5, while those on authentic distortions (LIVE In the Wild [35] and KonIQ-10k [60]) are summarized in Table 6. In addition, Table 7 presents the results of the one-sided t-test, which was applied to give evidence for the statistical significance of GSF-IQA's results on the used IQA benchmark databases. In this table, each record is encoded by one of two symbols. Namely, '1' means that the proposed GSF-IQA method is statistically significantly better than the NR-IQA method in the row on the IQA benchmark database in the column. The '-' symbol is adopted when there is no significant difference between GSF-IQA and the other NR-IQA method. Table 8 illustrates the weighted and direct averages of the PLCC and SROCC values found in Tables 5 and 6.
From the results presented in Tables 5-8, it can be seen that the proposed GSF-IQA provides the best results on four out of five IQA benchmark databases. Moreover, it gives the second best PLCC and SROCC values on LIVE In the Wild [35]. From the significance tests, it can be observed that the improvement is statistically significant on all databases containing artificial distortions. On the other hand, the difference between the best and the second best performing methods on LIVE In the Wild [35] and KonIQ-10k [60] is statistically not significant. It can also be observed from Table 8 that the proposed GSF-IQA method is able to outperform other state-of-the-art algorithms in terms of direct and weighted average PLCC and SROCC values. Specifically, GSF-IQA outperforms the second best method by approximately 0.02 both in terms of direct and weighted average PLCC and SROCC values. Figure 6 depicts the boxplots of the PLCC and SROCC values produced by GSF-IQA on each applied IQA benchmark database, measured over 100 random train-test splits. On every box, the red central mark stands for the median value, and the blue bottom and top edges of the box denote the 25th and 75th percentiles, respectively. In addition, the whiskers indicate the most extreme values which are not considered outliers. The outliers are depicted by '+'.

Performance over Different Distortion Types
In this subsection, we examine the performance of the state-of-the-art NR-IQA methods over different distortion types. Specifically, we report average SROCC values measured over the different distortion types of the KADID-10k database [46]. As already mentioned, this database consists of images with 25 different distortion types, such as Gaussian blur (GB), lens blur (LB), motion blur (MB), color diffusion (CD), color shift (CS), color quantization (CQ), color saturation 1 (CSA1), color saturation 2 (CSA2), JPEG2000 compression noise (JP2K), JPEG compression noise (JPEG), white noise (WN), white noise in color component (WNCC), impulse noise (IN), multiplicative noise (MN), denoise, brighten, darken, mean shift (MS), jitter, non-eccentricity patch (NEP), pixelate, quantization, color block (CB), high sharpen (HS), and contrast change (CC). The results are summarized in Table 9. It can be seen that the proposed GSF-IQA algorithm is able to provide the best results on 12 out of 25 distortion types.

Performance over Different Distortion Levels
In this subsection, we examine the performance of the state-of-the-art NR-IQA methods over different distortion levels. Specifically, we report on average SROCC values measured over the different distortion levels of the KADID-10k database [46]. The results are summarized in Table 10. As one can see from the results, the proposed GSF-IQA algorithm is able to outperform all the other state-of-the-art methods on all distortion levels.

Conclusions
In this paper, we proposed a novel NR-IQA algorithm based on a set of novel quality-aware features which globally characterize the statistics of an image. First, we exploited the fact that various image distortions change the local regularities of texture; thus, an extended local fractal dimension feature was proposed to quantify the texture's degradation. Second, we demonstrated that the first digit distributions of wavelet coefficients, DCT coefficients, and singular values can be used as quality-aware features and proposed extended first digit distribution feature vectors. This model was improved by Bilaplacian features, which were inspired by the extended classical receptive field model of retinal ganglion cells. To quantify the degradation of edges, image moments were incorporated into the model. The proposed algorithm was tested on five publicly available benchmark databases including CSIQ, MDID, KADID-10k, LIVE In the Wild, and KonIQ-10k. It was demonstrated that our proposal is able to outperform other state-of-the-art methods both on artificial and authentic distortions. There are two main directions for future research. Beyond feature concatenation, it is worth studying the selection of relevant attributes provided by different sources. Moreover, the incorporation of local statistical features provided by local feature descriptors may improve the performance, since some distortion types are not uniformly distributed across the image.
To facilitate the reproducibility of the presented results, the source code of the proposed method and the test environments, written in MATLAB R2020a, are available at: https://github.com/Skythianos/GSF-IQA, accessed on 5 February 2021.
Funding: This research received no external funding.

Data Availability Statement:
No new data were created or analysed in this study. Data sharing is not applicable to this article.